Python – knowledgebase JJ

Finding Text Similarity using Python

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space that measures the cosine of the angle between them. Similarity = (A.B) / (||A||.||B||) where A and B are vectors. nltk.tokenize: It is used for tokenization. Tokenization is the process by which big quantity of text is divided into… Continue reading Finding Text Similarity using Python

DJANGO FRAMEWORK BASICS

CREATING PROJECT django-admin startproject projectname eg:django-admin startproject myproject cd myproject myproject/ manage.py myproject/ init.py settings.py urls.py wsgi.py settings.py DEBUG = True for sqlite3 DATABASES = { ‘default’: { ‘ENGINE’: ‘django.db.backends.sqlite3’, ‘NAME’: os.path.join(BASE_DIR, ‘db.sqlite3’), } } for mysql DATABASES = { ‘default’: { ‘ENGINE’: ‘django.db.backends.mysql’, ‘NAME’: ‘databasename’, ‘USER’: ‘root’, ‘PASSWORD’: ”, ‘HOST’: ‘localhost’, ‘PORT’: ”, }… Continue reading DJANGO FRAMEWORK BASICS

Basic Libraries Required for Python programming:

1. Pandas: Pandas is a high-level data manipulation tool developed by Wes McKinney. It is built on the Numpy package and its key data structure is called the DataFrame. DataFrames allow you to store and manipulate tabular data in rows of observations and columns of variables. 2.Numpy NumPy is the fundamental package for scientific computing… Continue reading Basic Libraries Required for Python programming:

Implementing the Naive-Bayes Machine learning Model

It is a classification technique based on Bayes’ theorem with an assumption of independence between predictors. A Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. Naive Bayesian model is easy to build and particularly useful for very large data sets.… Continue reading Implementing the Naive-Bayes Machine learning Model

DECISION TREE CLASSIFICATION ALGORITHM IN MACHINE LEARNING

INDRODUCTION Decision Tree is a Supervised learning technique that can be used for both classification and Regression problems, but mostly it is preferred for solving Classification problems. It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules and each leaf node represents the outcome. Why use… Continue reading DECISION TREE CLASSIFICATION ALGORITHM IN MACHINE LEARNING

Python code to insert CSV data fields to a local Database table using Django

We can use this code to insert the data in a CSV file to a local database table. Just change the table name, database name, and data fields with your respective field name and others.

Cron job in django

SET UP installation: using the following command pip install django-crontab add it to installed apps in django settings.py: INSTALLED_APPS = ( ‘django_crontab’, … ) now create a new method that should be executed by cron every 5 minutes, f.e. in myapp/cron.py: now add this to your settings.py: CRONJOBS = (‘*/5 * * * *’, ‘myapp.cron.cron’)

Stemming: Implementation using Python code

A normalizing method in Python JIRA CODE: JJ-134 Stemming:The idea of stemming is a sort of normalizing method. Many variations of words carry the same meaning, other than when tense is involved.There are mainly two errors in stemming – Over stemming and Under stemming. Over stemming occur when two words are stemmed to same root… Continue reading Stemming: Implementation using Python code

Lemmatization: implementation using Python

For Reducing morphological variations and grouping words to one common root JIRA CODE – JJ-134 It is the process of grouping together the different inflected forms of a word so they can be analysed as a single item. Lemmatization is similar to stemming but it brings context to the words. So it links words with… Continue reading Lemmatization: implementation using Python

K-Nearest Neighbor Algorithm(K-NN Algorithm) in Python

Implementation code of K-NN Algorithm using Python language JIRA CODE – JJ – 134 This algorithm is used to solve the classification model problems. K-nearest neighbor or K-NN algorithm basically creates an imaginary boundary to classify the data. When new data points come in, the algorithm will try to predict that to the nearest of… Continue reading K-Nearest Neighbor Algorithm(K-NN Algorithm) in Python