At a high level, unequal text lengths force us to introduce new hyper-parameters or workarounds into our models. These are not fundamental to NLP; they exist only to handle unequal text lengths.

Unequal Text Length in Neural Models

The basic recipe for processing text in neural models is as follows:

1. Convert text to tokens.
2. Convert tokens to token IDs.
3. Look up embedding vectors for the IDs and construct a 3D tensor that is the input to subsequent layers.

Now, most neural network APIs are "define-and-run," so you cannot change the dimensions of the various matrices and tensors at runtime. How, then, do we handle varying text lengths?

1. Fix the text length. Use padding or chopping to force each sequence to the chosen length. This is the most common strategy.
2. Use bucketing. Define multiple models for different text lengths, then divide the incoming training data into buckets. Again, not very elegant.

Both of these techniques add complexity to the model. Text length becomes a hyper-parameter you have to choose, perhaps via hyper-parameter optimization. The number of buckets is another hyper-parameter, and handling buckets adds more code to your system. This is even more painful when you deploy the model in production.

Dynamic APIs provide a solution to this problem: frameworks like PyTorch can handle variable-length sequences. However, this prevents us from batching our training data, and we lose massively on performance.

There are many more examples of how different models handle unequal text lengths. Generally, all of them introduce more complexity to the model.
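The fixed-length strategy above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real pipeline: the vocabulary, `MAX_LEN`, and the decision to skip unknown words are all illustrative assumptions.

```python
MAX_LEN = 6   # the text-length hyper-parameter discussed above (illustrative)
PAD_ID = 0    # ID reserved for the padding token

# Toy vocabulary, purely for illustration.
vocab = {"<pad>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5,
         "dogs": 6, "bark": 7}

def encode(text):
    # Step 1: text -> tokens; Step 2: tokens -> IDs (unknown words skipped
    # here for brevity; a real tokenizer would map them to an <unk> ID).
    ids = [vocab[tok] for tok in text.lower().split() if tok in vocab]
    # Pad short sequences and chop long ones so every row has length MAX_LEN.
    return (ids + [PAD_ID] * MAX_LEN)[:MAX_LEN]

batch = [encode("the cat sat on the mat"), encode("dogs bark")]
# Every row now has the same length, so the batch forms a rectangular tensor.
```

Once every sequence has the same length, the batch can be stacked into the rectangular tensor that a define-and-run framework expects.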
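The bucketing strategy can be sketched similarly. The bucket boundaries here are illustrative hyper-parameters; the point is that each bucket pads only to its own maximum, at the cost of extra bookkeeping code.

```python
BUCKETS = [4, 8, 16]  # maximum length per bucket (illustrative choices)
PAD_ID = 0

def bucket_for(seq):
    # Return the smallest bucket that fits this sequence, or None if too long.
    for limit in BUCKETS:
        if len(seq) <= limit:
            return limit
    return None

def bucketize(sequences):
    buckets = {limit: [] for limit in BUCKETS}
    for seq in sequences:
        limit = bucket_for(seq)
        if limit is not None:
            # Pad only up to this bucket's maximum, not a global maximum.
            buckets[limit].append(seq + [PAD_ID] * (limit - len(seq)))
    return buckets

data = [[1, 2], [3, 4, 5, 6, 7], [8]]
buckets = bucketize(data)
# [1, 2] and [8] land in the length-4 bucket; [3, 4, 5, 6, 7] in the length-8 one.
```

Each bucket can now be batched and fed to a model sized for that length, which is exactly the extra complexity the section warns about.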
The basic rule, however, is to limit the text length.

2. Algorithms are not for your everyday use.

There are many things you can do with deep learning, but in practice most applications fall into a handful of common use cases.

1. Machine Learning. Deep learning is a branch of machine learning and AI that has taken off in the past decade, and it has changed the way business applications are approached. Today we find many machines in our offices capable of recognizing faces, labeling objects, and translating between languages. Deep learning in these applications uses deep networks, which are essentially massive neural networks. These networks are made up of "hidden layers": layers of nodes that pass information on to the next layer, thereby learning to recognize patterns in the data.
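The hidden-layer idea described above can be sketched in plain Python: each node combines its inputs with weights, applies a nonlinearity, and passes the result forward. The weights and layer sizes here are illustrative, not learned.

```python
def relu(x):
    # A common nonlinearity: pass positive values through, clamp negatives to 0.
    return max(0.0, x)

def layer(inputs, weights, biases):
    # One output per node: a weighted sum of the inputs plus a bias,
    # passed through the nonlinearity and on to the next layer.
    return [relu(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

x = [1.0, 2.0]                                            # input features
hidden = layer(x, weights=[[0.5, -0.25], [1.0, 1.0]],     # hidden layer, 2 nodes
               biases=[0.0, -0.5])
output = layer(hidden, weights=[[1.0, 0.5]], biases=[0.0])  # output layer, 1 node
```

Training would adjust the weights and biases; stacking many such layers is what makes the network "deep."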