GitHub Repo https://github.com/leenasuva/Topic-Analysis---BERT-Tokenizer

leenasuva/Topic-Analysis---BERT-Tokenizer

This project applies machine learning to categorize comments scraped from an online platform and to predict the topics associated with those comments. There are 40 candidate topics in total. Although this looks like a straightforward classification problem, a closer look at the data reveals that the real task is to make sense of the comments themselves and only then assign categories. Because the number of topics/classes is much larger than in a typical classification problem, the expected accuracy will not be very high.

Topic modeling and classification have become enormously popular for analyzing products and services for brands, measuring popularity during elections, discovering public sentiment around various issues, and so on. Deriving meaningful topics from comments is especially challenging because of variation in language, emojis, and partial or profane text. It is essential to choose a scheme that translates the comments into word embeddings so that similarity between comments can be computed and relevant topics assigned; it is equally important to capture the context and meaning of the comments and cluster them into relevant topics.

There are several established approaches to topic modeling, such as Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA), and these benchmark techniques generally provide viable results. The initial approach here was to vectorize the comments with Tf-Idf and Word2Vec and then apply state-of-the-art classification techniques to assign topics to those vectors. However, Bag-of-Words with Tf-Idf and word embeddings with Word2Vec share a significant hidden problem: they treat identical words with different meanings the same, without encoding any context.
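The initial Tf-Idf-plus-classifier pipeline can be sketched with scikit-learn. The comments, topic labels, and model choice below are hypothetical placeholders for illustration, not the repository's actual data or configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy comments and topic labels (placeholders, not the real dataset).
comments = ["great phone battery", "battery drains fast",
            "love this movie", "movie was boring"]
topics = ["electronics", "electronics", "entertainment", "entertainment"]

# Tf-Idf turns each comment into a sparse weighted bag-of-words vector;
# a linear classifier then assigns one of the candidate topics.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(comments, topics)

# "battery" only appears in electronics comments, so the model
# should lean toward that topic.
print(clf.predict(["the battery died"]))
```

In the real problem there would be 40 topic classes rather than two, which is a large part of why accuracy is expected to be modest.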
For example, the term “bank” in “Peter is fishing near the bank.” and “Two people robbed the state bank on Monday.” would receive the same vector in these representations. This yields misleading results, so to improve prediction performance it is essential to switch to a method that captures the context of each word. Transformers, a relatively recent modeling technique introduced by Google researchers in the seminal paper “Attention Is All You Need,” tackle exactly this problem. Google’s BERT (Bidirectional Encoder Representations from Transformers) builds on the idea of contextual embeddings popularized by ELMo, stacks multiple Transformer encoder layers, and is deeply bidirectional (a significant novelty for Transformer-based models). The vector BERT assigns to a word is a function of the entire sentence, so the same word can receive different vectors depending on context. ELMo, by comparison, is a contextual word-embedding technique that runs LSTMs over each sentence and derives the embeddings from their internal states.
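The “bank” problem can be made concrete with a minimal bag-of-words sketch in plain Python, using the two example sentences above: in a context-free representation, both occurrences of “bank” map to the same vocabulary dimension, so the two senses are indistinguishable.

```python
from collections import Counter

s1 = "peter is fishing near the bank".split()
s2 = "two people robbed the state bank on monday".split()

# Build a shared vocabulary and count-based (bag-of-words) vectors.
vocab = sorted(set(s1) | set(s2))
vec1 = [Counter(s1)[w] for w in vocab]
vec2 = [Counter(s2)[w] for w in vocab]

# In a context-free scheme, "bank" occupies the same single dimension
# in both sentences: the riverbank and the financial bank collapse
# into one representation.
i = vocab.index("bank")
print(vec1[i], vec2[i])  # 1 1
```

In practice, contextual vectors would instead come from a pretrained model, for example via Hugging Face's `transformers` library (`BertTokenizer` / `BertModel`), where the hidden state produced for “bank” differs between the two sentences.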
GitHub Repo https://github.com/sadaf-ali/Ear-recognition-in-3D-using-2D-curvilinear-features

sadaf-ali/Ear-recognition-in-3D-using-2D-curvilinear-features

This study presents a novel approach for human recognition using co-registered three-dimensional (3D) and 2D ear images. The proposed technique is based on local feature detection and description. The authors detect feature key-points in 2D ear images utilising curvilinear structures and map them to the 3D ear images. Considering a neighbourhood around each mapped key-point in 3D, a feature descriptor vector is computed. To match a probe 3D ear image against a gallery 3D ear image for recognition, highly similar feature key-points of the two images are first used as correspondence points for an initial alignment. Afterwards, a fine iterative closest point matching is performed on the entire data of the 3D ear images being matched. An extensive experimental analysis demonstrates the recognition performance of the proposed approach in the presence of noise and occlusions, and compares it with available state-of-the-art 3D ear recognition techniques. The recognition rate of the proposed technique is found to be 98.69% on the University of Notre Dame Collection J2 dataset, with an equal error rate of 1.53%.
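The matching stage described above (a coarse alignment from key-point correspondences followed by iterative closest point refinement) can be sketched with NumPy. This is a simplified illustration assuming brute-force nearest-neighbour search on small point sets; the paper's curvilinear key-point detection and descriptor computation are not reproduced here:

```python
import numpy as np

def rigid_transform(A, B):
    """Least-squares rotation R and translation t mapping points A onto B (Kabsch)."""
    cA, cB = A.mean(0), B.mean(0)
    H = (A - cA).T @ (B - cB)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cB - R @ cA

def icp(probe, gallery, corr_p, corr_g, iters=20):
    """Coarse alignment from matched key-points, then ICP refinement."""
    # Step 1: initial alignment from highly similar key-point correspondences.
    R, t = rigid_transform(corr_p, corr_g)
    P = probe @ R.T + t
    # Step 2: iterative closest point on the full 3D data.
    for _ in range(iters):
        # Brute-force nearest gallery point for each probe point.
        d2 = ((P[:, None, :] - gallery[None, :, :]) ** 2).sum(-1)
        matched = gallery[d2.argmin(1)]
        R, t = rigid_transform(P, matched)
        P = P @ R.T + t
    return P
```

The key-point correspondences give a good starting pose, which ICP then refines; without that initialisation, ICP alone tends to fall into a poor local minimum.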