Convolutional neural network that analyzes self-generated images in a variety of languages to find etymological similarities

Last update: Feb 03, 2022

CognateCNN

This project is a convolutional neural network (CNN) that analyzes self-generated images in a variety of languages to find etymological similarities. Specifically, the goal is to show that computer vision can identify cognates already known to exist, and perhaps lead linguists to evidence of unknown cognates. For a complete project description, please read the included paper "Cognate Analysis Using CNN", which describes in detail the purpose, implementation, and success of the project.

Installation Notes

This implementation requires Python 3.5 or later; Python 3.8 was used to write this software. A minimum of 3.5 is required in order to make full use of the built-in pathlib module. https://www.python.org/downloads/release

Python pip dependencies in this project include:

matplotlib
numpy
tensorflow (should be TensorFlow 2)
wikipedia
pillow
unidecode
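As a quick sanity check before running the scripts, the dependency list above can be verified with a short script. This is only a sketch and not part of the project: the function name missing_dependencies and the REQUIRED mapping are hypothetical, and the import names are assumed to match the pip names except for pillow, which is imported as PIL.

```python
import importlib.util
import sys

# Map of pip package name -> importable module name.
# "pillow" is imported as "PIL"; the other names are assumed to match.
REQUIRED = {
    "matplotlib": "matplotlib",
    "numpy": "numpy",
    "tensorflow": "tensorflow",
    "wikipedia": "wikipedia",
    "pillow": "PIL",
    "unidecode": "unidecode",
}


def missing_dependencies(required=REQUIRED):
    """Return the pip names whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    if sys.version_info < (3, 5):
        print("Python 3.5 or later is required")
    missing = missing_dependencies()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All dependencies found")
```

The check only confirms that each module can be located; it does not confirm that the installed tensorflow is TensorFlow 2.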

This implementation can be run within any interface that can read Python files. For example, any IDE that supports Python, such as PyCharm, would work. It can also be run at the command line: preface the ImageNeuralNet.py file with the Python 3 python.exe, either by giving the fully qualified path to your Python 3 installation directory or by adding the Python 3 python.exe to your Windows PATH. On Linux there are multiple ways to reference the interpreter as well, such as creating an alias to it like 'alias python="path/to/python"'.

Note: This implementation has not been tested on a Linux machine, although there are no Windows dependencies that I am aware of.

The user should see the following files:

ImageNeuralNet.py
TestImageNeuralNet.py
TestSingleImage.py

Each of these files will also create a local directory when run. These local directories will be used to store the generated image data.
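The directory creation described above can be sketched with pathlib. This is an illustration under assumptions, not the project's code: the helper ensure_data_dir and the folder name "generated_images" are hypothetical. The exist_ok argument of Path.mkdir was added in Python 3.5, which fits the minimum version given earlier.

```python
from pathlib import Path


def ensure_data_dir(base, name="generated_images"):
    """Create base/name if it does not exist and return its path."""
    data_dir = Path(base) / name
    # parents=True creates missing parent folders;
    # exist_ok=True makes repeated runs harmless.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir


if __name__ == "__main__":
    # Create the folder next to the current working directory.
    print(ensure_data_dir(Path.cwd()))
```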

ImageNeuralNet.py is the core of this project. It accepts no parameters. When run, it generates approximately 45,000 to 55,000 images using the Wikipedia pages specified in the code. It then passes the image data into a convolutional neural network for processing and saves the weights of the CNN model to a checkpoints folder within the local directory.
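To give a sense of what a self-generated image might look like, here is a minimal sketch that renders a piece of text onto a small grayscale canvas with Pillow. It is not taken from ImageNeuralNet.py: the function render_word, the 128x32 canvas size, and the use of Pillow's default font are all assumptions, and the default font may not cover non-ASCII characters.

```python
from PIL import Image, ImageDraw, ImageFont


def render_word(word, size=(128, 32)):
    """Draw `word` in black on a white grayscale canvas."""
    image = Image.new("L", size, color=255)  # "L" = 8-bit grayscale
    draw = ImageDraw.Draw(image)
    draw.text((2, 2), word, fill=0, font=ImageFont.load_default())
    return image


if __name__ == "__main__":
    img = render_word("water")
    # Save the image so it can be inspected by eye.
    img.save("water.png")
    print(img.size, img.mode)
```

An image like this could then be converted to an array (for example with numpy.asarray) before being fed to a CNN.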

TestImageNeuralNet.py is mainly used for testing the weights generated by ImageNeuralNet.py. It loads the CNN model from the checkpoints folder and runs it against the Wikipedia pages specified in the code.

When run, TestSingleImage.py prompts the user to input any word in any language. The code then loads the CNN model from the checkpoints folder and outputs the language it believes the word most likely belongs to, chosen from English, Spanish, German, and Italian.
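The final step, picking one language out of the four, can be sketched as taking the highest-scoring class from the model's output. This is an assumption about how such a classifier is typically read, not the project's actual code: the function most_likely_language is hypothetical, and the class order in LANGUAGES is assumed.

```python
# Assumed class order; the real order depends on how the model was trained.
LANGUAGES = ("English", "Spanish", "German", "Italian")


def most_likely_language(scores, languages=LANGUAGES):
    """Return the language whose score is highest."""
    if len(scores) != len(languages):
        raise ValueError("expected one score per language")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return languages[best_index]


if __name__ == "__main__":
    # Example scores, one per language, as a classifier might output them.
    print(most_likely_language([0.1, 0.2, 0.6, 0.1]))
```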

The pre-trained CNN weights within the checkpoints folder have been omitted to save space, as they are 30-40 GB in size. They can be provided on request.
