Getting started with Python Pandas
To begin the setup, go to anaconda.com/download, where you can find installers for Windows, macOS, and Linux. Download the installer for your machine. Anaconda is a Python distribution that aims to provide everything we need for data science tasks, including pandas and Jupyter. Follow the instructions in the installer wizard to complete the installation. Now you are good to begin. Open a terminal and type 'jupyter notebook'. This launches Jupyter Notebook in your browser at localhost:8888 and opens the file directory by default. You will see two more tabs, 'Running' and 'Clusters'. On the top right, select New -> Jupyter Notebook. You can rename the new file if you like. Notebook files have the .ipynb extension.
To print a message and check the output, type print('HELLO WORLD') in a cell. Then click the Run button at the top to see the output right under the cell.
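Along the same lines, a quick way to confirm that the installation picked up pandas is to print its version from a cell. This is only a minimal sketch; the variable names are my own, and the versions you see will depend on your install.

```python
import sys

import pandas as pd

# Version of the Python interpreter running this notebook, e.g. "3.x.y"
python_version = sys.version.split()[0]

# Version of the pandas library that was imported
pandas_version = pd.__version__

print("Python:", python_version)
print("pandas:", pandas_version)
```

If the import fails with a ModuleNotFoundError, the notebook is probably not running on the Anaconda Python.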
Let us now use a dataset from Kaggle. You need to sign in to download a dataset. Kaggle offers a few ways to sign up, so pick whichever is most convenient and proceed. Now search for 'video game sales'. Click on the first dataset, which has about 16 thousand rows and 11 columns. Download the .csv file and save it in the same folder as your Jupyter notebook. Now come back to the notebook at localhost:8888 and type the following instructions:
import pandas as pd
fileDetails = pd.read_csv('vgsales.csv')
fileDetails
Now run the cell to see the table from the vgsales file rendered right under it.
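Once the table shows up, a few more calls help confirm that the file loaded as expected. The sketch below is self-contained so it runs without the download: it reads three placeholder rows that I made up for illustration (not real sales figures) from an in-memory string, and it assumes the column names match the header of the Kaggle file. In your notebook you would keep pd.read_csv('vgsales.csv') instead.

```python
import io

import pandas as pd

# Placeholder rows invented for illustration only; these are NOT real sales figures.
# The header is assumed to match the one in the Kaggle vgsales.csv file.
sample_csv = io.StringIO(
    "Rank,Name,Platform,Year,Genre,Publisher,"
    "NA_Sales,EU_Sales,JP_Sales,Other_Sales,Global_Sales\n"
    "1,Game A,Platform X,2001,Sports,Publisher One,1.0,0.5,0.2,0.1,1.8\n"
    "2,Game B,Platform Y,2005,Racing,Publisher Two,0.8,0.4,0.3,0.1,1.6\n"
    "3,Game C,Platform X,2010,Puzzle,Publisher One,0.5,0.2,0.1,0.0,0.8\n"
)

# In the notebook, this line would be: fileDetails = pd.read_csv('vgsales.csv')
fileDetails = pd.read_csv(sample_csv)

# shape is a (rows, columns) tuple; the real file should report 11 columns
print(fileDetails.shape)

# List of column names taken from the header row
print(fileDetails.columns.tolist())

# head(n) returns the first n rows, handy when the full table is long
print(fileDetails.head(2))
```

Checking shape first is a cheap way to see whether the row and column counts match what the Kaggle page reports.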
It was great to get started with pandas. I am learning more about pandas and will share more in upcoming posts. Thank you for reading till the end. I strongly suggest you just get started with it. The effort matters more than the success, so focus on the journey and things will fall into place.