E-COMMERCE SALES DATA ANALYSIS

 

INTRODUCTION



This project analyzes an e-commerce dataset to understand sales trends, customer behaviour and product performance. The analysis aims to extract insights that can help businesses make data-driven decisions. Python, Pandas, Matplotlib and Seaborn are used for data processing and visualization.
 

OBJECTIVES



The main objectives of this project are:
- To analyze sales and revenue trends.
- To identify top-performing products.
- To understand customer and country-wise performance.
- To generate actionable business insights.



 

DATASET DESCRIPTION



The dataset contains transactional data from an online retail store. It includes details such as Invoice Number, Stock Code, Product Description, Quantity, Invoice Date, Unit Price, Customer ID and Country.
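Before any analysis, it can help to confirm that a loaded file actually has these columns. The sketch below is only an illustration: the column names are the ones shown in the notebook's `df.head()` output, while `missing_columns` and the toy `sample` frame are hypothetical and not part of the original notebook.

```python
import pandas as pd

# Column names as they appear in the df.head() output of this notebook.
EXPECTED_COLUMNS = [
    "InvoiceNo", "StockCode", "Description", "Quantity",
    "InvoiceDate", "UnitPrice", "CustomerID", "Country",
]


def missing_columns(frame: pd.DataFrame) -> list[str]:
    """Return the expected columns that are absent from `frame` (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in frame.columns]


# Toy frame standing in for the real CSV; it deliberately lacks 'Country'.
sample = pd.DataFrame(columns=EXPECTED_COLUMNS[:-1])
print(missing_columns(sample))
```

On the real data, an empty list from `missing_columns(df)` would indicate that every expected column is present.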
 

TOOLS & TECHNOLOGIES USED



- Python
- Pandas
- Matplotlib
- Seaborn
- Jupyter Notebook / Google Colab
 

FEATURE ENGINEERING



A new column 'Revenue' was created using: Revenue = Quantity * Unit Price
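As a small worked example of this formula, the sketch below applies it to three made-up rows. The toy `orders` frame is an assumption for illustration only; its last row mimics a return, which shows why rows with negative quantities produce negative revenue unless they are filtered out first.

```python
import pandas as pd

# Toy rows (made up); the last one mimics a return with a negative quantity.
orders = pd.DataFrame({
    "Quantity": [6, 8, -2],
    "UnitPrice": [2.55, 2.75, 3.39],
})

# Revenue = Quantity * UnitPrice, computed row by row.
orders["Revenue"] = orders["Quantity"] * orders["UnitPrice"]
print(orders)
```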
 

EXPLORATORY DATA ANALYSIS (EDA)



Exploratory Data Analysis was performed to understand patterns and trends in the dataset, using visualizations such as bar charts, line plots and heatmaps.
 

IMPORT & UPLOAD THE ZIP FILE

In [1]:
from google.colab import files
uploaded = files.upload()
Saving OnlineRetail.csv.zip to OnlineRetail.csv.zip
 

UNZIP THE FILE

In [3]:
import zipfile
with zipfile.ZipFile("OnlineRetail.csv.zip", 'r') as zip_ref:
    zip_ref.extractall()
 

EXTRACTED FILE NAME CHECKING

In [4]:
import os
os.listdir()
Out[4]:
['.config', 'OnlineRetail.csv', 'OnlineRetail.csv.zip', 'sample_data']
 

INSTALL LIBRARIES

In [5]:
pip install pandas matplotlib seaborn jupyter plotly streamlit
Successfully installed async-lru-2.3.0 jedi-0.19.2 json5-0.14.0 jupyter-1.1.1 jupyter-lsp-2.3.1 jupyterlab-4.5.6 jupyterlab-server-2.28.0 pydeck-0.9.1 streamlit-1.56.0
 

LOAD DATASET

In [6]:
import pandas as pd
df = pd.read_csv("OnlineRetail.csv",encoding = "ISO-8859-1")
df.head()
Out[6]:
   InvoiceNo  StockCode  Description                          Quantity  InvoiceDate     UnitPrice  CustomerID  Country
0  536365     85123A     WHITE HANGING HEART T-LIGHT HOLDER   6         12/1/2010 8:26  2.55       17850.0     United Kingdom
1  536365     71053      WHITE METAL LANTERN                  6         12/1/2010 8:26  3.39       17850.0     United Kingdom
2  536365     84406B     CREAM CUPID HEARTS COAT HANGER       8         12/1/2010 8:26  2.75       17850.0     United Kingdom
3  536365     84029G     KNITTED UNION FLAG HOT WATER BOTTLE  6         12/1/2010 8:26  3.39       17850.0     United Kingdom
4  536365     84029E     RED WOOLLY HOTTIE WHITE HEART.       6         12/1/2010 8:26  3.39       17850.0     United Kingdom
 

DATA CLEANING

In [7]:
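# Remove duplicate rows
df = df.drop_duplicates()

# Remove rows with a missing value in any column
df = df.dropna()

# Convert the InvoiceDate column from text to datetime
df['InvoiceDate'] = pd.to_datetime(df['InvoiceDate'])

# Keep only rows with a positive quantity (drops returns and zero-quantity rows);
# .copy() avoids a possible SettingWithCopyWarning when columns are added later
df = df[df['Quantity'] > 0].copy()
df.info()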
<class 'pandas.core.frame.DataFrame'>
Index: 392732 entries, 0 to 541908
Data columns (total 8 columns):
 #   Column       Non-Null Count   Dtype
---  ------       --------------   -----
 0   InvoiceNo    392732 non-null  object
 1   StockCode    392732 non-null  object
 2   Description  392732 non-null  object
 3   Quantity     392732 non-null  int64
 4   InvoiceDate  392732 non-null  datetime64[ns]
 5   UnitPrice    392732 non-null  float64
 6   CustomerID   392732 non-null  float64
 7   Country      392732 non-null  object
dtypes: datetime64[ns](1), float64(2), int64(1), object(4)
memory usage: 27.0+ MB
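To see how much each cleaning step removes, the row count can be compared before and after every step. The following is a minimal sketch on made-up data, not the notebook's actual figures: the `raw` frame and the `steps` dictionary are hypothetical, and only the three filters mirror the cleaning cell above.

```python
import pandas as pd

# Toy data (made up): one duplicate row, one missing CustomerID, one return.
raw = pd.DataFrame({
    "InvoiceNo": ["1", "1", "2", "3", "C4"],
    "Quantity": [6, 6, 8, 2, -1],
    "UnitPrice": [2.55, 2.55, 2.75, 3.39, 3.39],
    "CustomerID": [17850.0, 17850.0, None, 13047.0, 13047.0],
})

steps = {}

# Step 1: drop exact duplicate rows.
no_dupes = raw.drop_duplicates()
steps["duplicates"] = len(raw) - len(no_dupes)

# Step 2: drop rows with a missing value in any column.
no_missing = no_dupes.dropna()
steps["missing values"] = len(no_dupes) - len(no_missing)

# Step 3: keep only rows with a positive quantity.
cleaned = no_missing[no_missing["Quantity"] > 0]
steps["non-positive quantity"] = len(no_missing) - len(cleaned)

print(steps)
print("rows kept:", len(cleaned))
```

Reporting the counts per step makes it easier to judge whether a filter, such as dropping every row with a missing value, discards more data than intended.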
 

CREATE REVENUE COLUMN

In [10]:
df["Revenue"] = df["Quantity"] * df["UnitPrice"]
 

DATA ANALYSIS

In [12]:
top_products = df.groupby("Description")["Revenue"].sum().sort_values(ascending=False).head(10)
print(top_products)
Description
PAPER CRAFT , LITTLE BIRDIE           168469.60
REGENCY CAKESTAND 3 TIER              142264.75
WHITE HANGING HEART T-LIGHT HOLDER    100392.10
JUMBO BAG RED RETROSPOT                85040.54
MEDIUM CERAMIC TOP STORAGE JAR         81416.73
POSTAGE                                77803.96
PARTY BUNTING                          68785.23
ASSORTED COLOUR BIRD ORNAMENT          56413.03
Manual                                 53419.93
RABBIT NIGHT LIGHT                     51251.24
Name: Revenue, dtype: float64
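To put these figures in context, the combined share of the ten products can be compared with total revenue. The sketch below simply re-enters the numbers printed above and the total from the KPI summary; in the notebook itself the same ratio could presumably be computed as `top_products.sum() / df["Revenue"].sum()`, which is an assumption about how one would wire it in rather than a cell from the original.

```python
import pandas as pd

# Revenue figures copied from the top_products output in this notebook.
top_10_revenue = pd.Series([
    168469.60, 142264.75, 100392.10, 85040.54, 81416.73,
    77803.96, 68785.23, 56413.03, 53419.93, 51251.24,
])

# Total revenue as printed in the KPI summary of this notebook.
total_revenue = 8887208.894

share = top_10_revenue.sum() / total_revenue
print(f"Top 10 share of revenue: {share:.1%}")
```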
In [14]:
country_sales = df.groupby("Country")["Revenue"].sum().sort_values(ascending=False)
In [17]:
df["Month"] = df['InvoiceDate'].dt.to_period('M')
monthly_sales = df.groupby('Month')["Revenue"].sum()
 

DATA VISUALIZATION

In [18]:
import matplotlib.pyplot as plt
top_products.plot(kind='bar')
plt.title ("Top 10 Products")
plt.show()
[Figure: bar chart, "Top 10 Products"]
In [20]:
monthly_sales.plot()
plt.title("Monthly Revenue Trend")
plt.show()
[Figure: line plot, "Monthly Revenue Trend"]
In [21]:
country_sales.head(10).plot(kind='bar')
plt.title("Top Countries By Revenue")
plt.show()
[Figure: bar chart, "Top Countries By Revenue"]
In [23]:
top_customers = df.groupby('CustomerID')['Revenue'].sum().sort_values(ascending=False).head(10)

top_customers.plot(kind='bar')
plt.title("Top Customers By Revenue")
plt.show()
[Figure: bar chart, "Top Customers By Revenue"]
In [25]:
df['Hour'] = df['InvoiceDate'].dt.hour
hour_sales = df.groupby('Hour')['Revenue'].sum()
hour_sales.plot(title='Sales by Hour')
plt.show()
[Figure: line plot, "Sales by Hour"]
 

HEATMAP USING SEABORN

In [26]:
import seaborn as sns
corr = df[['Quantity', 'UnitPrice', 'Revenue']].corr()
sns.heatmap(corr, annot=True)
plt.show()
[Figure: correlation heatmap of Quantity, UnitPrice and Revenue]
In [27]:
df['Day'] = df['InvoiceDate'].dt.day_name()
day_sales = df.groupby('Day')['Revenue'].sum()
day_sales.plot(kind='bar')
plt.title('Sales by Day of the Week')
plt.show()
[Figure: bar chart, "Sales by Day of the Week"]
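Because `groupby('Day')` sorts its keys alphabetically, the bars in this chart appear in alphabetical rather than calendar order. One possible fix, sketched here on made-up totals, is to reindex the result with an explicit weekday list. `WEEKDAY_ORDER` and the toy `day_sales` values are assumptions for illustration; only `reindex` itself is standard pandas.

```python
import pandas as pd

WEEKDAY_ORDER = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]

# Toy revenue-by-day totals (made up), in the alphabetical order groupby returns.
day_sales = pd.Series({
    "Friday": 120.0,
    "Monday": 90.0,
    "Thursday": 150.0,
    "Tuesday": 110.0,
    "Wednesday": 100.0,
})

# reindex puts the days in calendar order; days without sales are filled with 0.
ordered = day_sales.reindex(WEEKDAY_ORDER, fill_value=0)
print(ordered)
```

Calling `ordered.plot(kind='bar')` would then draw the bars from Monday to Sunday.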
In [28]:
import matplotlib.pyplot as plt
plt.scatter(df['Quantity'],df['Revenue'])
plt.title("Quantity vs Revenue")
plt.xlabel("Quantity")
plt.ylabel("Revenue")
plt.show()
[Figure: scatter plot, "Quantity vs Revenue"]
In [29]:
df['Revenue'].plot(kind='hist',bins=50,title = "Revenue Distribution")
Out[29]:
<Axes: title={'center': 'Revenue Distribution'}, ylabel='Frequency'>
[Figure: histogram, "Revenue Distribution"]
In [30]:
import seaborn as sns
sns.boxplot(x=df['Revenue'])
Out[30]:
<Axes: xlabel='Revenue'>
[Figure: box plot of Revenue]
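The histogram and box plot can be complemented with a few summary numbers that quantify how skewed a distribution is. This is a minimal sketch on made-up values, not the notebook's data: the toy `revenue` series is hypothetical, and the same calls could be applied to `df['Revenue']`.

```python
import pandas as pd

# Toy revenue values (made up): many small transactions and one very large one.
revenue = pd.Series([10.0, 12.0, 15.0, 15.0, 18.0, 20.0, 25.0, 900.0])

summary = {
    "mean": revenue.mean(),
    "median": revenue.median(),
    "skew": revenue.skew(),  # a value above 0 indicates a long right tail
}
print(summary)
```

A mean far above the median, together with a positive skew, is the numeric counterpart of the long right tail seen in the plots.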
In [31]:
country_sales.head(5).plot(kind='pie',autopct='%1.1f%%')

Out[31]:
<Axes: ylabel='Revenue'>
[Figure: pie chart of revenue share for the top 5 countries]
In [32]:
monthly_sales_pct = monthly_sales.pct_change()
monthly_sales_pct.plot(title = "Monthly Growth Rate")
Out[32]:
<Axes: title={'center': 'Monthly Growth Rate'}, xlabel='Month'>
[Figure: line plot, "Monthly Growth Rate"]
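As a worked example of what `pct_change` computes, the sketch below uses three made-up monthly totals. The months and values are hypothetical; each growth figure is the current month divided by the previous month, minus one, so the first month has no value.

```python
import pandas as pd

# Toy monthly revenue (made up), indexed by monthly periods like the 'Month' column.
monthly = pd.Series(
    [100.0, 120.0, 90.0],
    index=pd.period_range("2011-01", periods=3, freq="M"),
)

# Growth rate: (current - previous) / previous; the first entry is NaN.
growth = monthly.pct_change()
print(growth)
```

Here the second month grows by about 20% and the third falls by 25%, which is the kind of swing the growth-rate plot makes visible.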
In [34]:
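top_product_name = df.groupby("Description")["Revenue"].sum().idxmax()
# Keep the monthly series in product_trend; plot with markers so a product sold in a single month stays visible
product_trend = df[df['Description'] == top_product_name].groupby('Month')['Revenue'].sum()
product_trend.plot(marker='o')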
/usr/local/lib/python3.12/dist-packages/pandas/plotting/_matplotlib/core.py:1561: UserWarning: Attempting to set identical low and high xlims makes transformation singular; automatically expanding. ax.set_xlim(left, right)
[Figure: line plot of monthly revenue for the top product]
 

KPI SUMMARY

In [41]:
total_revenue = df["Revenue"].sum()
total_orders = df["InvoiceNo"].nunique()
top_product = top_products.index[0]
top_country = country_sales.index[0]

print("Total Revenue:", total_revenue)
print("Total Orders:", total_orders)
print("Top Product:", top_product)
print("Top Country:", top_country)
Total Revenue: 8887208.894000003
Total Orders: 18536
Top Product: PAPER CRAFT , LITTLE BIRDIE
Top Country: United Kingdom
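The raw float output can be made easier to read with format specifiers. The sketch below re-enters the two figures printed above; the average order value is an extra derived figure that is not part of the original notebook and is shown only as an illustration.

```python
# Figures copied from the KPI output above.
total_revenue = 8887208.894000003
total_orders = 18536

# Derived figure (not in the original notebook): revenue per invoice.
average_order_value = total_revenue / total_orders

print(f"Total Revenue: {total_revenue:,.2f}")
print(f"Total Orders: {total_orders:,}")
print(f"Average Order Value: {average_order_value:,.2f}")
```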
 

INSIGHTS



* The top 10 products together account for roughly 10% of total revenue (about 885,000 of 8.89 million). Two of the ten entries, POSTAGE and Manual, appear to be non-merchandise line items.
* Sales are concentrated in a few countries, with the United Kingdom generating the most revenue.
* Revenue shows clear monthly/seasonal fluctuations.
* Peak sales occur during specific months (likely the holiday season).
* A small number of customers generate most of the revenue.
* Sales vary by time (peak hours and specific days).
* The raw data contains negative quantities, which indicate product returns; these rows were removed during cleaning.
* Revenue distribution is highly skewed (few high-value transactions).
* Revenue growth is inconsistent across months.
* Many products contribute very little to overall sales.
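The claim about customer concentration can be checked numerically. The following is a minimal sketch under stated assumptions: `top_share` is a hypothetical helper, the toy revenues are made up, and the 20% cut-off is an arbitrary choice. On the real data it could be applied to the per-customer totals from `df.groupby('CustomerID')['Revenue'].sum()`.

```python
import pandas as pd


def top_share(revenue_by_customer: pd.Series, fraction: float = 0.2) -> float:
    """Share of total revenue from the top `fraction` of customers (hypothetical helper)."""
    ranked = revenue_by_customer.sort_values(ascending=False)
    # Always include at least one customer, even for very small inputs.
    n_top = max(1, int(len(ranked) * fraction))
    return ranked.iloc[:n_top].sum() / ranked.sum()


# Toy per-customer revenue (made up): two large customers and eight small ones.
toy = pd.Series([500.0, 300.0, 50.0, 40.0, 30.0, 30.0, 20.0, 15.0, 10.0, 5.0])
print(round(top_share(toy), 2))
```

In this toy case the top two of ten customers bring in 80% of revenue; the real figure would have to be computed from the dataset.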











 

CONCLUSION



This analysis provided insights into sales performance, customer behaviour and product trends. The findings can help businesses optimize strategies, improve customer retention and increase revenue.