

Regular Expressions in Python

A regular expression is a special sequence of characters that helps the user match or find other strings or sets of strings, using a specialized syntax held in a pattern. Regular expressions are widely used in the UNIX world. The re module offers full support for Perl-like regular expressions in Python.

This article covers the most important functions used to handle regular expressions.


match = re.search(pat, str)

The re.search() method takes a regular expression pattern and a string and searches for that pattern within the string. If the search is successful, search() returns a match object; otherwise it returns None. Therefore, the search is usually followed by an if-statement that tests whether the search succeeded.

import re

str = 'an example word:cat!!'
match = re.search(r'word:\w\w\w', str)
# If-statement after search() tests if it succeeded
if match:
    print('found ' + match.group())  ## 'found word:cat'
else:
    print('did not find')

The code match = re.search(pat, str) stores the search result in a variable named “match”. Then the if-statement tests the match: if it is true, the search succeeded and match.group() is the matching text (e.g. 'word:cat'). Otherwise, if the match is false (it is None), the search did not succeed and there is no matching text.

The 'r' at the start of the pattern string designates a Python “raw” string, which passes backslashes through without change. This is very convenient for regular expressions.
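The effect of the raw prefix is easiest to see with an escape that Python itself interprets. The following is a minimal sketch in Python 3 syntax; the sample text and variable names are made up for illustration.

```python
import re

# In a normal string literal, "\b" is the backspace character (one character).
normal = "\bcat\b"
# In a raw string literal the backslash is kept, so the regex engine
# receives \b, which it reads as a word boundary.
raw = r"\bcat\b"

text = "the cat sat"
normal_match = re.search(normal, text)  # looks for backspace characters
raw_match = re.search(raw, text)        # looks for the whole word "cat"

print(len(normal), len(raw))  # 5 7
print(normal_match)           # None
print(raw_match.group())      # cat
```

Without the prefix, the same pattern would have to be written as "\\bcat\\b".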



The re.match() function attempts to match a regular expression pattern against the beginning of a string, with optional flags.

Here is the syntax for this function:
re.match(pattern, string, flags=0)

The re.match function returns a match object on success and None on failure. Use the group(num) or groups() method of the match object to get the matched expression.

The match function

import re
line = "Cats are smarter than dogs"
matchObj = re.match(r'(.*) are (.*?) .*', line, re.M|re.I)
if matchObj:
    print("matchObj.group() : " + matchObj.group())
    print("matchObj.group(1) : " + matchObj.group(1))
    print("matchObj.group(2) : " + matchObj.group(2))
else:
    print("No match!!")

When the above code is executed, it gives the following result:

matchObj.group() : Cats are smarter than dogs
matchObj.group(1) : Cats
matchObj.group(2) : smarter
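The difference between re.match and re.search is easy to miss, so here is a short sketch in Python 3 syntax that runs both on the same line. The patterns are chosen only for illustration; groups() is shown as well, since it returns every captured group at once.

```python
import re

line = "Cats are smarter than dogs"

# re.match only succeeds if the pattern matches at the start of the string.
at_start = re.match(r"Cats", line)
not_at_start = re.match(r"dogs", line)

# re.search scans the whole string for the first location that matches.
anywhere = re.search(r"dogs", line)

# groups() returns all captured groups as a tuple.
pairs = re.match(r"(\w+) are (\w+)", line).groups()

print(at_start.group())  # Cats
print(not_at_start)      # None
print(anywhere.span())   # (22, 26)
print(pairs)             # ('Cats', 'smarter')
```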


Data Mining With Python

Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. It is applied in a wide range of domains, and its practices have become fundamental to many applications.

This article is about the tools used in real data mining for finding and describing structural patterns in data using Python. In recent years, Python has been used increasingly for the development of data-centric applications.


The very first step of a data analysis consists of obtaining the data and loading it into the user's work environment. The user can download the data with the following Python code:

import urllib2

url = ‘

u = urllib2.urlopen(url)

localFile = open('iris.csv', 'w')

localFile.write(u.read())

localFile.close()



In the above snippet the user has used the urllib2 library to access a file on the website and saved it to disk using the methods of the file object provided by the standard library. The file contains the iris dataset, a multivariate dataset that consists of 50 samples from each of three species of Iris flowers. Each sample has four features: the length and the width of the sepal and of the petal, in centimetres.

The dataset is stored in the CSV format. It is convenient to parse the CSV file and store the information it contains in a more suitable data structure. The dataset has 5 columns: the first 4 columns contain the values of the features, while the last column gives the class of each sample. The CSV file can be parsed easily using the genfromtxt function of the numpy library:

from numpy import genfromtxt, zeros

# read the first 4 columns

data = genfromtxt('iris.csv', delimiter=',', usecols=(0,1,2,3))

# read the fifth column

target = genfromtxt('iris.csv', delimiter=',', usecols=(4,), dtype=str)

In the above example the user has created a matrix with the features and a vector that contains the classes. The user can also confirm the size of the dataset by looking at the shape of the data structures that were loaded:

print(data.shape)

(150, 4)

print(target.shape)

(150,)

print(set(target)) # build a collection of unique elements

set(['setosa', 'versicolor', 'virginica'])
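For readers without numpy at hand, the same split into features and classes can be sketched with the standard csv module. This swaps in a different technique from genfromtxt and is written in Python 3 syntax. The three rows below are illustrative rows in the iris.csv layout, and the absence of a header row is an assumption.

```python
import csv
import io

# Three illustrative rows in the same layout as iris.csv
# (assumption: four numeric columns, then the class name, no header row).
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
    "6.3,3.3,6.0,2.5,virginica\n"
)

features = []  # one list of four floats per sample
labels = []    # one class name per sample
for row in csv.reader(sample):
    features.append([float(value) for value in row[:4]])
    labels.append(row[4])

print(len(features), len(features[0]))  # 3 4
print(sorted(set(labels)))              # ['setosa', 'versicolor', 'virginica']
```

To read the real file, the StringIO object would be replaced by an opened iris.csv.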

An important task when working with new data is to understand what information the data contains and how it is structured. Visualization helps the user explore the information graphically and gain understanding and insight into the data.


Classification is a data mining function that assigns the samples in a dataset to target classes. The models that implement this function are called classifiers. There are two basic steps in using a classifier: training and classification. The sklearn library contains implementations of many classification models.

t = zeros(len(target))

t[target == 'setosa'] = 1

t[target == 'versicolor'] = 2

t[target == 'virginica'] = 3

The classification can be done with the predict method, and it is easy to test it with one of the samples:

print(classifier.predict(data[0]))

[ 1.]

print(t[0])

1.0


In this case the predicted class is equal to the correct one (setosa), but it is important to assess the classifier on a wider range of samples and to test it with data not used in the training process.


Often we do not have labels attached to the data that tell us the class of the samples. In that case the user has to analyse the data in order to group them on the basis of a similarity criterion, where groups are sets of similar samples. This kind of analysis is called unsupervised data analysis. One of the most famous clustering tools is the k-means algorithm, which can be run as follows:

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3, init='random') # initialization

kmeans.fit(data) # actual execution

The snippet above runs the algorithm and groups the data into 3 clusters (as specified by the parameter n_clusters, which early versions of sklearn called k). Now the user can use the model to assign each sample to one of the clusters:

c = kmeans.predict(data)

The user can then evaluate the results of the clustering by comparing them with the labels that are already known, using the completeness and homogeneity scores:

from sklearn.metrics import completeness_score, homogeneity_score

print(completeness_score(t, c))


print(homogeneity_score(t, c))


The completeness score approaches 1 when most of the data points that are members of a given class are elements of the same cluster, while the homogeneity score approaches 1 when all the clusters contain almost only data points that are members of a single class.
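To make the two scores concrete, here is a pure-Python sketch of the usual entropy-based definitions, homogeneity = 1 - H(C|K)/H(C) and completeness = 1 - H(K|C)/H(K), where C are the classes and K the clusters. It is written in Python 3 syntax and is only an illustration of those formulas, not sklearn's own implementation. The function names and the tiny label lists are made up for this example.

```python
from collections import Counter
from math import log

def entropy(labels):
    """H(labels): uncertainty of a labelling."""
    n = len(labels)
    return -sum(count / n * log(count / n) for count in Counter(labels).values())

def conditional_entropy(a, b):
    """H(a | b): uncertainty left in labels a once labels b are known."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum(n_ab / n * log(n_ab / b_counts[b_value])
                for (_, b_value), n_ab in joint.items())

def homogeneity(classes, clusters):
    h = entropy(classes)
    return 1.0 if h == 0 else 1.0 - conditional_entropy(classes, clusters) / h

def completeness(classes, clusters):
    # Completeness is homogeneity with the roles of classes and clusters swapped.
    return homogeneity(clusters, classes)

classes = [1, 1, 2, 2]
perfect = [0, 0, 1, 1]   # every cluster holds exactly one class
lumped = [0, 0, 0, 0]    # everything thrown into a single cluster

print(round(homogeneity(classes, perfect), 3), round(completeness(classes, perfect), 3))  # 1.0 1.0
print(round(homogeneity(classes, lumped), 3), round(completeness(classes, lumped), 3))    # 0.0 1.0
```

The second case shows why both scores are needed: a single big cluster is perfectly complete but not homogeneous at all.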

The user can also visualize the result of the clustering and compare the assignments with the real labels visually:


subplot(211) # top figure with the real classes




subplot(212) # bottom figure with classes assigned automatically






What is garbage collection and does python have it?

Garbage collection is the systematic recovery of memory that a program has allocated but no longer needs, which frees that memory for reuse by the same program or by other programs.

Python also has a built-in garbage collector, which reclaims unused memory and makes it available to the heap again.
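The collector can be observed through the standard gc module. The sketch below, in Python 3 syntax, assumes the CPython interpreter, where most objects are freed by reference counting and a separate cycle detector reclaims objects that only refer to each other. The Node class is made up for the example.

```python
import gc
import weakref

class Node:
    """A tiny object that can point at another Node."""
    def __init__(self):
        self.partner = None

gc.disable()  # pause automatic collection so the demonstration is predictable

a, b = Node(), Node()
a.partner, b.partner = b, a  # reference cycle: a -> b -> a
probe = weakref.ref(a)       # lets us check whether the object still exists

del a, b                     # no names refer to the pair any more
alive_before = probe() is not None  # the cycle keeps both objects alive

gc.collect()                 # run the cycle detector by hand
alive_after = probe() is not None   # the pair has now been reclaimed

gc.enable()
print(alive_before, alive_after)  # True False
```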



Explain multi-threading in Python?

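Python supports multi-threading through the threading module in its standard library. In CPython, the reference interpreter, the Global Interpreter Lock (GIL) by default lets only one thread execute Python bytecode at a time, so threads help with I/O-bound work but do not speed up CPU-bound code.

When real parallelism is needed, the work can be handed to the operating system (by doing multi-processing), to some external application that calls your Python code, or to some code that your Python code calls.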
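As a minimal sketch of what threaded code looks like, the example below uses the threading module from the standard library (Python 3 syntax). The worker function and the word list are invented for illustration.

```python
import threading

results = {}
lock = threading.Lock()

def word_length(word):
    # Each thread stores its result under the lock so updates do not interleave.
    with lock:
        results[word] = len(word)

threads = [threading.Thread(target=word_length, args=(w,))
           for w in ("spam", "eggs", "python")]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every thread to finish

print(sorted(results.items()))  # [('eggs', 4), ('python', 6), ('spam', 4)]
```

For CPU-bound work, the multiprocessing module offers a similar interface built on separate processes instead of threads.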



What is lambda in Python?

The lambda operator, or lambda function, is a way to create small anonymous functions, i.e. throw-away functions without a name that are needed only where they are created.
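Two typical throw-away uses are sketched below in Python 3 syntax; the word list and numbers are made up for illustration.

```python
# A lambda used as a throw-away sort key: order words by length, then alphabetically.
words = ["pear", "fig", "banana", "kiwi"]
by_length = sorted(words, key=lambda w: (len(w), w))

# The same idea with map(): square each number without defining a named function.
squares = list(map(lambda x: x * x, [1, 2, 3]))

print(by_length)  # ['fig', 'kiwi', 'pear', 'banana']
print(squares)    # [1, 4, 9]
```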



What Is Python? What are the benefits of using Python? What do you understand of PEP 8?

Python is one of the most successful interpreted languages. When you write a Python script, you do not need to compile it before execution. A few other interpreted languages are PHP and JavaScript.

Benefits Of Python Programming:

Python is a dynamically typed language, which means that you don't need to state the data type of variables when you declare them. You can set variables like var1 = 101 and var2 = "You are an engineer." without any error.

Python supports object-oriented programming: you can define classes and use composition and inheritance. It doesn't use access specifiers like public or private.

Functions in Python are first-class objects. This means you can assign them to variables, return them from other functions, and pass them as arguments.

Developing in Python is quick, but running it is often slower than running compiled languages. Luckily, Python lets you include C language extensions so that you can optimize your scripts.

Python has many uses, such as web-based applications, test automation, data modeling, big data analytics, and much more. Alternatively, you can use it as a “glue” layer to work with other languages.
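The points about dynamic typing and first-class functions can be sketched in a few lines of Python 3. All function names here are hypothetical and exist only for this example.

```python
def shout(text):
    return text.upper() + "!"

def make_repeater(times):
    # A function that builds and returns another function.
    def repeat(text):
        return text * times
    return repeat

def call_with(func, value):
    # A function received as an argument and called like any other object.
    return func(value)

speak = shout             # assigned to a variable
twice = make_repeater(2)  # returned from another function

var1 = 101                       # the name first holds an integer
var1 = "You are an engineer."    # and later a string, with no declaration

print(call_with(speak, "hi"))  # HI!
print(call_with(twice, "ab"))  # abab
print(type(var1).__name__)     # str
```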

PEP 8:

PEP 8 is the style guide for Python code, a set of coding recommendations. It guides you towards writing more readable Python code.



Top 3 Python IDE

Smart programmers not only code well but also do it in style. Becoming a good programmer may take longer than you expect. However, choosing the best Python IDE can certainly reduce your coding effort. IDLE is the Python IDE that comes with the standard Python package. It allows quick editing and execution of Python scripts. However, it lacks many features that can increase speed and boost productivity.

Picking the right IDE is crucial, as it can help the programmer automate a lot of tasks and ease project management. So the developer must choose a development tool wisely, one that they will not regret using later in the project lifecycle.

There are a number of factors that the developer might like to consider when shortlisting. Here are a few points for making an entry-level distinction:

  • An ideal Python IDE should support multiple platforms such as Windows, Linux, and MacOS.
  • Check if it is available for free or is open source, for example under a GPL license.
  • Also, confirm if the IDE has a community version that suits students and beginners who are learning Python.

IT companies or professionals working with big organizations might have access to paid versions of commercial IDEs. Komodo, PyCharm, Sublime, and Wing IDE are just a few of them.

In this article, we present the 3 best Python IDEs, voted the most advanced and feature-rich by experienced and professional programmers.


PyCharm is a complete Python IDE loaded with a rich set of features. JetBrains is the software company behind the development of PyCharm, and it has left no stone unturned in keeping this tool up to date while meeting the increasing needs of Python developers.

It is an enterprise-level product that comes in two variations – the first is the community edition, free for non-commercial usage, and the next is the premium version for advanced as well as enterprise users.

For basic users, the free version is enough to start working. It includes almost every feature a developer might seek in an IDE – automatic code completion, quick project navigation, built-in version control support, code inspection and refactoring, PEP 8 quality audit, fast error checking and correction, UI-level debugging, and integrated AUT testing. Other key features include integration with IPython notebook and support for Anaconda as well as packages like NumPy and Matplotlib for scientific computing.

High-level features such as remote development support, database accessibility, and ability to use extensible web development frameworks (WDF) exist only in the premium version of PyCharm.

Many developers consider it the best Python IDE because of its ability to work with a number of WDFs such as Django, Web2Py, GAPP, Flask, and Pyramid. Undoubtedly, it is one of the best IDEs for creating small to large-scale web applications.


Eric is an open-source Python IDE written in Python using the Qt framework. Its name is derived from Monty Python's Eric Idle. Despite being a non-commercial product, it has all the features needed for professional software development.

The creator of Eric is Detlev Offenbach, a senior system engineer from Munich. He has been maintaining it for many years, so that it can compete with any of its peers. In terms of usage and downloads, it is hard to match. The IDE is available under the GPL license for unlimited usage.

Eric has a robust plugin manager which you can use to extend the functionality by adding appropriate plugins. The latest and stable version is Eric6 built on PyQt5/4 and Python2/3.

Some of the standard features of Eric are code completion, bracket matching, call tips, syntax highlighting, class browser, code profiling, and integrated unit tests. Developers can also make use of its form preview function while working on a Qt GUI application. Below is a list of features that make Eric stand out against competitors like PyCharm and Wing.

  • Integrated debugger support for multithreaded or multiprocessing applications.
  • Automatic code checkers.
  • Intuitive project management.
  • Built-in unittest support.
  • Inbuilt Python shell.
  • Addons for Regex and QT dialogs.
  • Integrated web browser.


WING is also one of the top IDE alternatives for Python developers. It is a paid solution from WingWare. The company made huge investments in Wing and added many new and relevant features. Also, it has released a number of updates over the years.

Like PyCharm, Wing also supports Windows, Linux, and Mac OS X. The company offers three types of packages: a freeware version with moderate features, a personal version for individual users, and a high-end edition for enterprise users.

WING Python IDE combines an intelligent code editor with a great debugging tool. Together, these features make Python coding easy, interactive, accurate, and fast. Its robust graphical debugger lets the developer set breakpoints, navigate through code, monitor data, and debug multi-process or multi-threaded code, and it also supports remote debugging on SoC (System on Chip) devices such as the Raspberry Pi. It also integrates with version control systems such as Git, CVS, SVN, Mercurial, and Perforce, so developers can check code in and out and manage merges within the IDE.

The WING team has ensured that the IDE supports all the major Python frameworks available today, including PyQt, PyGTK, PySide, Zope, MotionBuilder, Django, and many more. It also supports Matplotlib, where the plots get updated automatically.


Python 2 Vs Python 3

Python is an extremely readable and adaptable programming language. The name was inspired by the British comedy group Monty Python, and it was a major foundational goal of the Python development team to make the language fun and easy to use. Easy to set up and written in a relatively straightforward style with immediate feedback on errors, Python is a great choice for beginners.

Before looking at the key programmatic differences between Python 2 and Python 3, let's start with the background of the most recent major releases of Python.


Python 2 brought a more transparent and inclusive language development process than earlier versions of Python, with the introduction of the PEP (Python Enhancement Proposal). Python 2 also added many programmatic features, including a cycle-detecting garbage collector to automate memory management, increased Unicode support to standardize characters, and list comprehensions to create a list based on existing lists. As Python 2 continued to develop, more features were added, including the unification of Python types and classes into one hierarchy in Python version 2.2.


Python 3 is regarded as the future of Python and is the version of the language that is currently in development. Python 3 was released in late 2008 to address and amend intrinsic design flaws of previous versions of the language. The focus of Python 3 development was to clean up the codebase and remove redundancy. Major modifications in Python 3.0 include changing the print statement into a built-in function, improving the way integers are divided, and providing more Unicode support.


Following the 2008 release of Python 3.0, Python 2.7 was published on July 3, 2010 and planned as the last of the 2.x releases. The main intention behind Python 2.7 was to make it easier for Python 2.x users to port features to Python 3 by providing some measure of compatibility between the two. This compatibility support includes enhanced modules for version 2.7 such as unittest to support test automation, argparse for parsing command-line options, and more convenient classes in collections.


While Python 2.7 and Python 3 share many capabilities, they should not be thought of as interchangeable. Though a user can write good code and useful programs in either version, it is worth understanding that there are some considerable differences in code syntax and handling.


In Python 2, print is treated as a statement instead of a function, which is a typical area of confusion, as many other actions in Python require arguments inside parentheses to execute. If the user wants the console to print out “The Shark is my favourite sea creature” in Python 2, the user can do it with the following print statement:

print "The Shark is my favourite sea creature"

In Python 3, print() is explicitly treated as a function, so to print out the same string above, the user can easily do it with the simple syntax of a function:

print("The Shark is my favourite sea creature")

This change made Python’s syntax more uniform and also made it easier to change between different print functions.
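One practical consequence of print() being a function is that it takes keyword arguments and can be passed around like any other object. The sketch below (Python 3) writes into an in-memory buffer so the result can be inspected. The buffer and the name log are only for illustration.

```python
import io

buffer = io.StringIO()

# sep, end and file are keyword arguments of the print() function.
print("The Shark", "is my favourite", "sea creature",
      sep=" ", end="!\n", file=buffer)

# Because print is an ordinary function, it can be bound to another name.
log = print
log("a", "b", "c", sep="-", file=buffer)

output = buffer.getvalue()
print(output, end="")
# The Shark is my favourite sea creature!
# a-b-c
```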


In Python 2, any number that the user types without decimals is treated as the programming type called integer. While at first this seems like an easy way to handle programming types, a user who divides one integer by another may expect to get an answer with decimal places (called a float), as in:

5 / 2 = 2.5

However, in Python 2 integers were strongly typed and would not change to a float with decimal places even in cases that would make instinctive sense.

When the two numbers on either side of the division “/” symbol are integers, Python 2 does floor division: for the quotient x, the number returned is the largest integer less than or equal to x. This means that when you write 5 / 2 to divide the two numbers, Python 2.7 returns the largest integer less than or equal to 2.5, in this case 2:

a = 5 / 2

print a

2
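For comparison, here is a short sketch of how the same operators behave in Python 3, where “/” always performs true division and “//” is the explicit floor-division operator. The variable names are illustrative.

```python
true_quotient = 5 / 2     # Python 3: "/" always performs true division
floor_quotient = 5 // 2   # "//" is floor division in both Python 2 and Python 3
negative_floor = -5 // 2  # floor rounds towards negative infinity, not towards zero

print(true_quotient)   # 2.5
print(floor_quotient)  # 2
print(negative_floor)  # -3
```

In Python 2.7, placing from __future__ import division at the top of a module gives “/” the Python 3 behaviour.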


When programming languages handle the string type, i.e. a sequence of characters, they can do so in several different ways so that computers can convert numbers to letters and other symbols.

Python 2 uses the ASCII alphabet by default, so when you type “Hello” Python 2 will handle the string as ASCII. Limited to a couple of hundred characters at best in various extended forms, ASCII is not a very flexible method for encoding characters, especially non-English characters.

Python 3 uses Unicode by default, which saves programmers development time, because the programmer can easily type and display many more characters directly in the program. Unicode supports a far greater diversity of linguistic characters than ASCII.
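A brief sketch of what this means in practice in Python 3: a str holds Unicode characters, and encoding to bytes is an explicit step. The sample text is made up, and the accented letters are written as escapes only to keep the source file plain ASCII.

```python
greeting = "H\u00e9llo, w\u00f6rld"  # a str is a sequence of Unicode characters
encoded = greeting.encode("utf-8")   # bytes: the form used on disk or on the wire

print(len(greeting))  # 12 characters
print(len(encoded))   # 14 bytes, because the two accented letters need 2 bytes each
print(encoded.decode("utf-8") == greeting)  # True

try:
    greeting.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # ASCII has no code for the accented letters.
    ascii_ok = False
print(ascii_ok)  # False
```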


Python is a flexible and well-documented programming language to learn, and whether you choose to work with Python 2 or Python 3, you will be able to work on exciting software projects.

Though there are several key differences, it is not difficult to move from Python 3 to Python 2 with a few tweaks, and you will often find that Python 2.7 can run simple Python 3 code.

It is important to keep in mind that, as most developers are focused on Python 3, the language will become more refined and in line with the evolving needs of programmers, and less support will be given to Python 2.7.