Cristiano Ronaldo
Introduction
This article analyzes a dataset of CR7's goals. Before diving in, let's introduce the star of this article, although I think he is famous even among those who are not interested in football.
Cristiano Ronaldo dos Santos Aveiro is a Portuguese professional footballer renowned for his remarkable career as a forward. He was born on February 5, 1985, at the Hospital Dr. Nélio Mendonça in Funchal, Portugal. At the time of writing, he plays for the Premier League club Manchester United and captains the Portugal national team.
Ronaldo is known for wearing the number 7 jersey, and his on-field accomplishments regularly make him a trending topic. At 1.87 m tall, he commands attention on the field. Off the field, he has been in a relationship with Georgina Rodriguez since 2017.
Let’s move to the serious part of this article, the analysis of our dataset.
Step1: Load the Data
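import pandas as pd
# Replace the placeholder paths with the actual locations of your CSV files
df = pd.read_csv("your_path/data.csv")
df_o = pd.read_csv("your_path/data.csv/overall.csv")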
Step2: Explore the Dataset
The first command counts the unique values in each column. The second summarizes the text columns: the number of non-null entries, the number of unique values, the most frequent value, and how often it occurs.
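# Count of distinct values per column (missing values count as one value)
df.nunique(dropna=False).to_frame("Unique Values Count")
# Summary of the text columns: count, unique, top (most frequent value) and freq
df.describe(include=['object']).T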
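To make the output of these two commands concrete, here is a minimal sketch on a tiny made-up table. The rows are invented for illustration and are not taken from the real dataset; only the column names `Club` and `Competition` come from the article.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
# dtype=object keeps the text columns as object so include=["object"] selects them.
toy = pd.DataFrame(
    {
        "Club": ["Sporting CP", "Manchester United", "Manchester United", "Real Madrid"],
        "Competition": ["Liga Portugal", "Premier League", "Premier League", "LaLiga"],
    },
    dtype=object,
)

# Number of distinct values in each column.
unique_counts = toy.nunique(dropna=False).to_frame("Unique Values Count")
print(unique_counts)

# For text columns, describe() reports count, unique, top and freq.
summary = toy.describe(include=["object"]).T
print(summary)
```

In the summary, `top` is the most frequent value of a column and `freq` is how many times it appears, so this view tells you which club or competition dominates rather than any minimum or maximum.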
Step3: EDA, Data Visualization
1. Goals per competition
import plotly.express as px

# Each row of df is one goal, so the bar heights are goal counts
px.histogram(
    df,
    x='Competition',
    title="Goals per competition",
    height=500,
    color='Club',
    hover_name='Club',
    hover_data=['Competition', 'Club'],
)
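The histogram above counts rows, so the same numbers can be checked as a plain table. The following is a minimal sketch, assuming each row is one goal; the rows are made up for illustration and only the column names come from the article.

```python
import pandas as pd

# Made-up rows for illustration; each row stands for one goal.
toy = pd.DataFrame({
    "Competition": ["Premier League", "Premier League", "LaLiga", "LaLiga", "LaLiga"],
    "Club": ["Manchester United", "Manchester United", "Real Madrid", "Real Madrid", "Real Madrid"],
})

# Count rows per (Competition, Club) pair, then sort by goal count.
goals = (
    toy.groupby(["Competition", "Club"])
    .size()
    .reset_index(name="Goals")
    .sort_values("Goals", ascending=False)
    .reset_index(drop=True)
)
print(goals)
```

Running the same `groupby` on `df` should give the values behind each colored bar segment, which is handy when you want exact figures instead of reading them off the chart.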
2. Goals per season
px.histogram(
    df,
    x='Season',
    title="Goals per season",
    height=500,
    color='Club',
    hover_name='Club',
    hover_data=['Competition', 'Season', 'Club'],
)
3. Goals per club
px.histogram(
    df,
    x='Club',
    title="Goals per Clubs - Seasons",
    height=500,
    color='Season',
    hover_name='Season',
    hover_data=['Competition', 'Season', 'Club'],
)
# For "Goals per Clubs - Competition", set color and hover_name to 'Competition' instead
4. Goals per playing Position
px.histogram(
    df,
    x='Playing_Position',
    title="Goals per playing Position",
    height=500,
    color='Club',
    hover_name='Club',
    hover_data=['Playing_Position', 'Competition', 'Season', 'Club'],
)
5. Assists
import matplotlib.pyplot as plt
import seaborn as sns

# Assisters ordered from most to fewest assists; value_counts() sorts
# descending and skips rows with no recorded assister
assisters = df['Goal_assist'].value_counts().index.tolist()

sns.set(rc={'figure.figsize': (30, 5)})
p = sns.countplot(x=df['Goal_assist'], order=assisters)
p.set_title("Goals Assist", fontsize=30)
plt.xticks(rotation='vertical')
plt.show()

# Split the ordered list into five equal parts so each plot stays readable
n = len(assisters)
fifths = [assisters[i * n // 5:(i + 1) * n // 5] for i in range(5)]

# Plot only the first fifth, i.e. the players with the most assists
sns.set(rc={'figure.figsize': (20, 5)})
p = sns.countplot(x=df['Goal_assist'], order=fifths[0])
p.set_title("Goals Assisted by", fontsize=30)
plt.xticks(fontsize=15, rotation='vertical')
plt.show()
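The ordering used in the plots above comes from `value_counts()`. Here is a minimal sketch of what it returns, using invented player names; the assumption that a missing value marks a goal with no recorded assist is mine and is not confirmed by the dataset.

```python
import pandas as pd

# Invented names for illustration; None stands for a row with no recorded assister.
toy = pd.Series(
    ["Player A", "Player B", "Player A", None, "Player C", "Player A", "Player B"],
    name="Goal_assist",
)

# Sorted from most to fewest assists; missing values are dropped by default.
counts = toy.value_counts()
print(counts)

# Keep only the two most frequent assisters.
top = counts.head(2)

# Rows without an assister (assumed here to be unassisted goals).
unassisted = int(toy.isna().sum())
print(top.index.tolist(), unassisted)
```

Using `head(k)` is a simple alternative to slicing the list by fifths when you only want a fixed number of top assisters on the chart.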
6. Total Goals timeline
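import plotly.graph_objects as go
from plotly.offline import iplot

# Copy the column so the assignments below do not touch a view of df
dfl = df[['Date']].copy()
# Keep real datetimes so Plotly draws a proper time axis
dfl['Date'] = pd.to_datetime(dfl['Date'])
# Running goal number, assuming the rows are in chronological order
dfl['Goal_no'] = range(1, len(dfl) + 1)
dfl['Goal'] = 1
trace1 = go.Scatter(
    x=dfl.Date,
    y=dfl.Goal_no,
    name="CR7 Total Goals Time graph",
    line=dict(color='blue'),
    opacity=0.4,
)
layout = dict(title='CR7 Goals')
fig = dict(data=[trace1], layout=layout)
iplot(fig)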
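The timeline relies on the rows being in date order. The sketch below shows one way to make that explicit before numbering the goals. The dates are made up, and the `"%m/%d/%y"` format is an assumption for illustration; check how dates are written in your own file.

```python
import pandas as pd

# Made-up dates; the "%m/%d/%y" format is an assumption, adjust it to your file.
toy = pd.DataFrame({"Date": ["08/29/09", "10/07/02", "11/01/03", "11/01/03"]})
toy["Date"] = pd.to_datetime(toy["Date"], format="%m/%d/%y")

# Sort chronologically, then number the goals 1..n as a running total.
toy = toy.sort_values("Date").reset_index(drop=True)
toy["Goal_no"] = range(1, len(toy) + 1)

# Goals per calendar year, derived from the same column.
goals_per_year = toy.groupby(toy["Date"].dt.year).size()
print(toy)
print(goals_per_year)
```

Passing an explicit `format` avoids pandas guessing between day-first and month-first dates, which matters for a plot whose x axis is time.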
Conclusion
As this article shows, CR7 has put in a huge amount of work and scored a lot of goals. But the question remains: CR7 or Messi?