Category Archive Data & Analytics

By Slide Scope

K means clustering algorithm example using Python

K-means clustering is an unsupervised learning algorithm. You can apply it to datasets that have no labeled output data. Only input data is available, and the goal is to find regularities in the data so that similar items are grouped, or clustered, together.

You can copy the code and run it line by line in a Jupyter Notebook.

Watch the videos given at the bottom of this post to understand the process clearly.

What is a cluster? A group of data points aggregated together because of certain similarities.
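To make the idea of grouping similar points concrete, here is a minimal sketch of the usual k-means loop in plain Python. It is only an illustration under a few assumptions: the function name kmeans, its parameters, and the toy points are made up for this example and are not part of scikit-learn, which is what the rest of this post uses.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups with a basic k-means loop (illustrative sketch)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random as the initial centroids
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: give every point the label of its nearest centroid
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of the points assigned to it
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # An empty cluster keeps its previous centroid
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # the centroids stopped moving, so the loop has converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two well-separated groups of 2-D points
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Each pass repeats two steps, assigning points to the nearest centroid and then recomputing the centroids, until nothing changes. The max_iter argument plays the same role as the max_iter setting passed to KMeans later in this post.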


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Import the algorithm from scikit-learn https://scikit-learn.org
from sklearn.cluster import KMeans
# Get the wine dataset https://archive.ics.uci.edu/ml/datasets/wine
names = ['Class', 'Alcohol', 'Malic acid', 'Ash', 'Alcalinity of ash', 'Magnesium', 'Total phenols',
         'Flavanoids', 'Nonflavanoid phenols', 'Proanthocyanins', 'Color intensity', 'Hue', 'OD280/OD315',
         'Proline']
data = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data', names = names)
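# Preview the first 100 rows
data.head(100)
# Bar chart of how many wines fall in each class
data['Class'].value_counts().plot(kind='bar')
# Scatter plot of the two features used for clustering
data.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', figsize=(8,5))

# Same scatter plot, colored by the true class label
data.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', c= 'Class', figsize=(8,5), colormap='jet')
# Columns 12 and 1 are 'OD280/OD315' and 'Alcohol'
data.iloc[:,[12,1]].head()
# General usage pattern:
# kmeans = KMeans().fit(data)
# kmeans = KMeans(n_clusters = 2)
# kmeans.fit(X)
# kmeans.cluster_centers_
# kmeans.labels_
# Limit the run to one iteration (max_iter = 1) to see the clusters at an early stage
kmeans = KMeans(n_clusters=3, init = 'random', max_iter = 1, random_state = 5).fit(data.iloc[:,[12,1]])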


# Put the centroids in a DataFrame so they can be plotted on top of the data
centroids_df = pd.DataFrame(kmeans.cluster_centers_, columns = list(data.iloc[:,[12,1]].columns.values))
fig, ax = plt.subplots(1, 1)
# Color each wine by its assigned cluster, then overlay the centroids
data.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', c= kmeans.labels_, figsize=(12,8), colormap='jet', ax=ax, mark_right=False)
centroids_df.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', ax = ax, s = 80, mark_right=False)
# Fit again, this time allowing up to 150 iterations
kmeans = KMeans(n_clusters=3, init = 'random', max_iter = 150, random_state = 5).fit(data.iloc[:,[12,1]])
centroids_df = pd.DataFrame(kmeans.cluster_centers_, columns = list(data.iloc[:,[12,1]].columns.values))
fig, ax = plt.subplots(1, 1)
data.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', c= kmeans.labels_, figsize=(12,8), colormap='jet', ax=ax, mark_right=False)
centroids_df.plot.scatter(x = 'Alcohol', y = 'OD280/OD315', ax = ax, s = 80, mark_right=False)

K-means clustering with centroids
What is K-means clustering?
You can watch the theory here:

Applying K-means clustering on the wine dataset:


How to plot Boxplot in Python

A box plot is used to visualize 5 values in a dataset for the selected column(s):

  • Minimum Value
  • First Quartile or 25%
  • Median (Second Quartile) or 50%
  • Third Quartile or 75%
  • Maximum value

Box Plot is also known as Box and Whisker Plot.
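The five values listed above can be computed directly, which helps when reading a box plot. This is a minimal sketch using only the Python standard library; the ages are taken from the ten rows that data.head(10) prints in this post, and method="inclusive" is an assumption chosen because it should correspond to the linear interpolation that Pandas uses by default for quartiles.

```python
import statistics

# Ages from the first ten rows of insurance.csv as printed in this post
ages = [19, 18, 28, 33, 32, 31, 46, 37, 37, 60]

# Quartiles with linear interpolation between neighboring data points
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

five_number_summary = {
    "min": min(ages),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(ages),
}
print(five_number_summary)
# {'min': 18, 'q1': 28.75, 'median': 32.5, 'q3': 37.0, 'max': 60}
```

The box in a box plot spans q1 to q3, the line inside the box marks the median, and the whiskers reach toward the minimum and maximum.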

Steps –

  1. Load the dataset into a Pandas DataFrame
  2. Select any column to visualize
  3. Plot the boxplot using Pandas
    OR
  4. Plot the boxplot using Seaborn

Python Code :

import pandas as pd

# Load the data

data = pd.read_csv('insurance.csv')

data.head(10)

index  age  gender     bmi  children  smoker     region      charges  id
    0   19  female  27.900         0     yes  southwest  16884.92400   1
    1   18    male  33.770         1      no  southeast   1725.55230   2
    2   28    male  33.000         3      no  southeast   4449.46200   3
    3   33    male  22.705         0      no  northwest  21984.47061   4
    4   32    male  28.880         0      no  northwest   3866.85520   5
    5   31  female  25.740         0      no  southeast   3756.62160   6
    6   46  female  33.440         1      no  southeast   8240.58960   7
    7   37  female  27.740         3      no  northwest   7281.50560   8
    8   37    male  29.830         2      no  northeast   6406.41070   9
    9   60  female  25.840         0      no  northwest  28923.13692  10

data.describe()

               age          bmi     children       charges           id
count  1338.000000  1338.000000  1338.000000   1338.000000  1338.000000
mean     39.207025    30.663397     1.094918  13270.422265   669.500000
std      14.049960     6.098187     1.205493  12110.011237   386.391641
min      18.000000    15.960000     0.000000   1121.873900     1.000000
25%      27.000000    26.296250     0.000000   4740.287150   335.250000
50%      39.000000    30.400000     1.000000   9382.033000   669.500000
75%      51.000000    34.693750     2.000000  16639.912515  1003.750000
max      64.000000    53.130000     5.000000  63770.428010  1338.000000

# The column argument of the Pandas boxplot selects what to plot.
# It takes the name of one column of the dataset or a list of columns.
data.boxplot(column=['age'], figsize=[10,7])

# We can group the data as well, here one box per gender.

data.boxplot(column=['age'], by=['gender'], figsize=[10,7])

Boxplot Using Seaborn Library
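Here is a minimal sketch of the same grouped boxplot drawn with Seaborn. The small DataFrame is a stand-in built from the age and gender values of the ten rows printed above, so the example runs without insurance.csv; with the real file you would pass the loaded data instead. The title text is made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in for insurance.csv: age and gender from the ten rows printed above
data = pd.DataFrame({
    "age": [19, 18, 28, 33, 32, 31, 46, 37, 37, 60],
    "gender": ["female", "male", "male", "male", "male",
               "female", "female", "female", "male", "female"],
})

fig, ax = plt.subplots(figsize=(10, 7))
# One box per gender, with age on the vertical axis
sns.boxplot(data=data, x="gender", y="age", ax=ax)
ax.set_title("Age by gender")
```

In a Jupyter Notebook the figure is displayed below the cell. The x argument plays the role of by in the Pandas version, and y plays the role of column.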


What is Machine Learning

Machine learning is a technique in which a computer is programmed so that it can predict output data on its own from input data, and can learn by itself from the inputs it is given.
In software development, a program is written according to the requirement. In machine learning, the machine uses artificial intelligence to learn, like a human, the ability to carry out a task on its own.

Machine learning falls under the subject of artificial intelligence.

Types of machine learning –

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

Tableau Course

Tableau Training in Lucknow

Tableau training is taken by students and working professionals in the data science industry.

Tableau is an easy-to-use tool for analyzing data stored in structured formats such as JSON, Excel, CSV, XML, SQL etc.

Tableau is used to create interactive dashboards to visualize data in graphical and tabular form.


Benefits of Tableau Course

1. Tableau is a simple, easy-to-learn software tool for data analytics.

2. If you are new to coding or you don't know coding at all, you can start working on Tableau immediately.

3. Tableau helps you build quick, creative and interactive dashboards.

4. Tableau can handle large datasets easily.

5. Tableau automatically detects the data types in spreadsheets. Data types such as strings, numeric values and geographical values can be easily visualized using Tableau.

What is Data Analytics

The term data analytics is made up of two words: data means figures, and analytics means analysis. So data analytics means the analysis of data.

What is big data?

When data is analyzed for marketing, research and growing a business, it is called data analytics or big data.

Recording data in a spreadsheet or table, turning it into graphs or charts for analysis, and then writing up the resulting information in plain language so that it can be considered and an appropriate decision can be made: this whole process is called data analytics.

Where is data analytics used?

Data analytics is used in almost every business. Some of the major business sectors that use data analytics are listed below:

  • Marketing
  • Health and pharma
  • Logistics
  • Agriculture
  • Sports
  • Information and communication

What are the benefits of data analytics?

Data analytics gives us all the important information about our own business or our client's business. From this information we learn which factors are driving the business upward, and also which factors are causing the business to suffer losses.

With the help of this data we can come up with new ideas and lead our business toward success.

Which tools are used for data analytics?

The following tools are used for data analytics:

SQL databases

R programming language

Python programming language

MS-Excel


How to load data from Facebook Insights to Google Data Studio

The Facebook Insights connector for Google Data Studio has been live for some time, and it has some amazing features.

Google Data Studio can be used to fetch data from other sources.

Google Data Studio can fetch graphical data from Facebook Insights.

We will guide you step by step through creating stunning graph and chart reports using Google Data Studio.

Open this URL and log in with your existing Gmail ID.

https://datastudio.google.com/navigation/reporting

Step 1 – Click on Start a New Report.

Step 2 – Click on Create New Data Source.

Step 3 – Go to the Connectors panel in the left sidebar.

Step 4 – Click on Explore Connectors.

You will see many data sources here, listed alphabetically, such as Adobe Analytics and AdWords.

Here you have to look for Facebook Insights.

After you click Facebook Insights, it will appear in the left Connectors sidebar.

(You can also connect to Facebook Ads.)

You will see a message – Data Studio requires authorization to use this community connector.

Click on the Authorize button.

You have to select Facebook Insights by Supermetrics.

Next you will see this message:

Facebook Insights requires authorization to connect to data.

Log in to Facebook and grant the permissions.

Supermetrics will now fetch the data from your Facebook Pages.

You will be asked to select pages. You can select one page or select all using the check boxes.

You will see a list of selected Pages.


Now click on CONNECT in the top right corner.

On the next screen you will see fields that you can select for a detailed report.

You can also create your own calculated fields.

Now click ADD To Reports.

You will see a popup, and here too you have to click Add to Reports.

Once all the data is loaded you will be able to customize the reports and analyze them in one place. You can also show these reports to your clients.


Data Analytics Course: Data Analytics Certification Training

A large amount of data is needed for the efficient growth of any organization. Data analytics is the process of reaching meaningful and actionable conclusions through detailed examination of information. The Data Analytics Course focuses on improving an individual's technical, qualitative and quantitative ability to analyze data from various sources. Data in every sector is growing day by day at a great pace, and it is expected that by the year 2020 there will be forty-five times as much data as there is today.

The volume of data is growing much faster than the number of people who can analyze it efficiently. Data analytics is needed mostly in the following fields:

  • Arts and Humanities
  • Business
  • Computer Science
  • Data Science
  • Life Sciences
  • Math and Logic
  • Personal Development
  • Physical Science and Engineering
  • Social Sciences
  • Language Learning

You can see that almost every business sector is covered in the list.

In Information Technology there are many tools that can be used to analyze data.

Tools that will be covered in this Course are:

  • Python
  • Git
  • Advanced MS Excel
  • Tableau
  • R Studio
  • R
  • Jupyter
  • SAAS
  • SQL

Why should you choose us as a mentor for the Data Analytics Course?

  • We have faculty with 12+ years of experience in information technology consulting and a Master of Business Administration degree in Management Science from Lucknow University.
  • The syllabus of this course is developed by professionals with more than 25 years of experience.
  • You will work on live projects and get practical experience of the methods and techniques used in large enterprises for data analytics.
  • We have small batches, which ensures proper attention to the queries of each and every student.
  • We focus on providing the core knowledge and concepts of data analytics that can be applied to any field of study or work.

Is this course beneficial for programmers only?

This course is helpful for both programmers and non-programmers. It is designed in such a way that anyone can learn the material easily.

Key Elements of Syllabus of Data Analytics Course


  • Descriptive Statistics
  • Inferential Statistics
  • Regression & ANOVA (Analysis of Variance)
  • Machine Learning: Introduction and Concepts
  • Unsupervised Learning and Challenges for Big Data Analytics
  • Web Analytics for Digital Marketing
  • Writing simple and complex SQL queries to fetch data from single and multiple tables.

We have provided successful job-oriented training and placement assistance for students from Lucknow, Kanpur, Varanasi, Sultanpur, Faizabad, Delhi, Pune and Mumbai.

Book Your Free Demo Class of Data Analytics Now – Contact Us

Here is a trend analysis of Data Analytics vs. Data Analysis in India for the last 12 months