Global COVID-19 data analysis in Python

The global COVID-19 data come from the European Centre for Disease Prevention and Control (ECDC) and can be downloaded manually from https://www.ecdc.europa.eu. The program below, however, downloads the data each time it runs.

The code structure is:

1: Download data, or import files

2: Format the data, that is, convert the cases and deaths columns to int

3: Accumulate cases and deaths per country

4: Sort by total cases, with the largest first

5: Take the top 20 countries

6: Analyze the data of the last 20 days for these countries
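Steps 2 to 5 can be tried without a network connection on a small table. The following is only a minimal sketch: the records are made up, and the helper name rank_countries and its top_n parameter are hypothetical, not part of the program. Only the column names are taken from the ECDC feed.

```python
import pandas as pd

# Made-up records shaped like the ECDC feed (values are illustrative only).
records = [
    {"dateRep": "01/05/2020", "cases": "10", "deaths": "1", "countryterritoryCode": "AAA"},
    {"dateRep": "02/05/2020", "cases": "20", "deaths": "2", "countryterritoryCode": "AAA"},
    {"dateRep": "01/05/2020", "cases": "5", "deaths": "0", "countryterritoryCode": "BBB"},
    {"dateRep": "02/05/2020", "cases": "50", "deaths": "3", "countryterritoryCode": "BBB"},
    {"dateRep": "02/05/2020", "cases": "7", "deaths": "0", "countryterritoryCode": "CCC"},
]


def rank_countries(records, top_n=20):
    """Steps 2-5: format, accumulate, sort, and keep the top_n countries."""
    df = pd.DataFrame.from_records(records)
    # Step 2: convert the counts to int (here they arrive as strings).
    df["cases"] = df["cases"].astype(int)
    df["deaths"] = df["deaths"].astype(int)
    # Step 3: accumulate cases and deaths per country code.
    totals = df.groupby("countryterritoryCode").agg(
        sumcase=("cases", "sum"),
        sumdeath=("deaths", "sum"),
    )
    # Step 4: largest total first. Step 5: keep the first top_n rows.
    return totals.sort_values(by="sumcase", ascending=False).head(top_n)


top = rank_countries(records, top_n=2)
print(top)
```

The toy records stand in for the download of step 1, and step 6 (the date filter) is left out to keep the sketch short.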

The complete code is as follows:

"""
@author: liwenz
"""

import json
from datetime import datetime, timedelta
import pandas as pd
import requests
 
# Option A: read a previously saved copy of the feed.
#with open('corid2019.json','r') as f:
#    data=json.load(f)
# Option B: download the current data from ECDC.
url='https://opendata.ecdc.europa.eu/covid19/casedistribution/json/'
r = requests.get(url)
r.raise_for_status()  # stop here on an HTTP error
data = r.json()

# Build a DataFrame and convert the numeric columns to int.
df = pd.DataFrame.from_records(data['records'])
for col in ['day', 'month', 'year', 'cases', 'deaths']:
    df[col] = df[col].astype('int')
# Parse the report date (day/month/year) into a datetime column.
df['date'] = pd.to_datetime(df['dateRep'], format='%d/%m/%Y')

# Cut-off date: only reports from the last 20 days are listed.
today=datetime.now()
print(today.day,today.month,today.year)
daybefore20=today-timedelta(days=20)
print(daybefore20)

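# Recent rows only (reported after the cut-off date daybefore20).
df1=df[df['date']>daybefore20]
# Cumulative totals per country code.
df2=df.groupby('countryterritoryCode').agg(
        country=('countriesAndTerritories','last'),
        sumcase=('cases','sum'),
        sumdeath=('deaths','sum'),
        popu=('popData2019','last'))
# Largest case count first, then keep the 20 countries with the most cases.
df2.sort_values(by=['sumcase'], inplace=True, ascending=False)
a=df2.head(20)
b=a.index.values.tolist()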
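# Print each country's rank and totals, then its recent daily figures, newest first.
for i, x in enumerate(b, start=1):
    print(i, a.loc[x,'country'],a.loc[x,'sumcase'],a.loc[x,'sumdeath'],a.loc[x,'popu'])
    # Sort into a new frame; sorting a slice in place triggers SettingWithCopyWarning.
    tmp=df1[df1['countryterritoryCode']==x].sort_values(by=['date'], ascending=False)
    print(tmp.loc[:,['dateRep','cases','deaths']].to_string(index=False))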
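The program can take its data either from a download or from a saved file (step 1). One way to put both options behind a single function is sketched below. The name load_records and its parameters are hypothetical; the URL and the file name corid2019.json are the ones used in the program.

```python
import json

ECDC_URL = "https://opendata.ecdc.europa.eu/covid19/casedistribution/json/"


def load_records(path=None, url=ECDC_URL, timeout=30):
    """Return the list stored under 'records', from a local file or from the URL."""
    if path is not None:
        # Offline mode: read a previously saved copy of the JSON feed.
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    else:
        # Online mode: requests is imported here so that offline use
        # does not require the package to be installed.
        import requests

        r = requests.get(url, timeout=timeout)
        # Raise on an HTTP error instead of failing later while parsing JSON.
        r.raise_for_status()
        data = r.json()
    return data["records"]


# Usage (not executed here):
#   records = load_records("corid2019.json")   # from a saved file
#   records = load_records()                   # from the ECDC server
```

The result can be passed directly to pd.DataFrame.from_records, as in the program above.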

The dataset information can be obtained with df.info(), as follows:

df.info()

RangeIndex: 25726 entries, 0 to 25725
Data columns (total 12 columns):
dateRep                    25726 non-null object
day                        25726 non-null int32
month                      25726 non-null int32
year                       25726 non-null int32
cases                      25726 non-null int32
deaths                     25726 non-null int32
countriesAndTerritories    25726 non-null object
geoId                      25726 non-null object
countryterritoryCode       25662 non-null object
popData2019                25564 non-null float64
continentExp               25726 non-null object
date                       25726 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(1), int32(5), object(5)
memory usage: 1.9+ MB
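The non-null counts show that 25726 - 25662 = 64 rows have no countryterritoryCode and 25726 - 25564 = 162 rows have no popData2019. Because groupby drops rows whose key is missing by default, rows without a code do not appear in the per-country totals. The sketch below shows this effect on made-up rows; the country names and numbers are purely illustrative.

```python
import pandas as pd

# Made-up rows; the last one has no country code, like some rows in the real data.
df = pd.DataFrame({
    "countriesAndTerritories": ["A_land", "A_land", "Some_ship"],
    "countryterritoryCode": ["AAA", "AAA", None],
    "cases": [10, 20, 7],
})

# Missing values per column (the complement of the non-null counts in df.info()).
missing = df.isna().sum()
print(missing)

# Grouping by the code silently drops the row whose code is missing ...
by_code = df.groupby("countryterritoryCode")["cases"].sum()
# ... whereas grouping by the name column keeps every row.
by_name = df.groupby("countriesAndTerritories")["cases"].sum()
print(by_code.sum(), by_name.sum())
```

In pandas 1.1 and later, groupby also accepts dropna=False, which keeps such rows as a group of their own.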