Category: Python

Pandas in a nutshell

Introduction: Pandas is an open-source data analysis library for Python, used extensively for data analysis, data munging, and cleaning. Pandas has high-performance, user-friendly data structures. What makes Pandas a great choice for data analysis? It is its rich and highly performant data structures, which are built on…

A Glimpse of JupyterLab

Introduction: Finally, there is a tool that can be seen as a one-stop shop for all the work you do as a data scientist. JupyterLab is still in the development phase, as mentioned on its GitHub page. However, you can already install it, start using it, and see how amazing this tool is. It blends a file browser, terminal,…

Data Analysis of IMDB Data

[youtube https://www.youtube.com/watch?v=mS3dzczv1ZQ?version=3&rel=1&fs=1&autohide=2&showsearch=0&showinfo=1&iv_load_policy=1&start=1&wmode=transparent]

We are all surrounded by data. It reveals a lot to us, informs our decisions, and suggests the next steps. Data is collected from different sources such as the web, databases, and log files. It is then thoroughly cleaned and reshaped, and analyzed and explored to uncover the hidden patterns and trends that are essential for business decision making. Extracting data from the web is usually easy when a website offers an API, but what if it doesn't? In that case, web scraping is an excellent way to extract unstructured data from the web and put it into a structured format such as Excel, CSV, or a database.
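To make the idea of turning unstructured web content into a structured format concrete, here is a minimal sketch using only the Python standard library. It is not the approach from the video. The HTML fragment, the movie titles and ratings, and the names TableParser and html_table_to_csv are all made up for illustration, and a real page would first have to be downloaded and would likely need more careful parsing.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a downloaded page;
# the titles and ratings are made-up sample values.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Rating</th></tr>
  <tr><td>Movie A</td><td>8.1</td></tr>
  <tr><td>Movie B</td><td>7.4</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # row currently being read, if any
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


def html_table_to_csv(html):
    """Parse the table rows out of an HTML string and return them as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


csv_text = html_table_to_csv(SAMPLE_HTML)
print(csv_text)
```

The same rows could just as well be written to a file or loaded into a database. CSV text is used here only because it keeps the sketch self-contained.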

Learn Python for Data Science from Scratch for Beginners

Why Python?

Python is a multipurpose programming language that is widely used for data science, which has been termed the sexiest job of this century. Data scientists mine large datasets to gain insight and make meaningful, data-driven decisions. As a general-purpose programming language, Python is also used for web development, networking, scientific computing, and more. We will discuss a series of awesome Python libraries: NumPy, SciPy, and pandas for data manipulation and wrangling, and Matplotlib, seaborn, and Bokeh for data visualization.

Python and R are just tools for data science, though. To be a data scientist, you need to know more about the statistical and mathematical aspects of the data, and on top of everything, good domain knowledge is a must.