Real-Time AI-Powered Market Sentiment Dashboard (Dash + Transformers)

Interactive dashboard that pulls live news or Reddit posts, runs NLP preprocessing and sentiment analysis (Hugging Face distilbert-base-uncased-finetuned-sst-2-english and TextBlob), and optionally generates short market summaries with GPT‑2. Built with Dash + Plotly.

Source file in this repo: Real-Time Market Sentiment Dashboard - M_Analyse_Sources_P.py (you can rename it to app.py).


Overview

  • Data sources: NewsAPI (/v2/everything) or Reddit (via PRAW).
  • NLP:
    • Tokenization, stopword removal, lemmatization (NLTK).
    • Transformer-based sentiment (DistilBERT SST‑2) via transformers.pipeline.
    • TextBlob polarity as a second baseline.
    • Optional short generation/summaries with GPT‑2 (text-generation pipeline).
  • UI (Dash):
    • Inputs: query, date range, language, page size, source (news/reddit).
    • Outputs: two histograms (Transformers vs TextBlob), generated text area.
  • Runtime: CPU by default, optional GPU with PyTorch CUDA.

Project Structure

Real-Time-AI-Powered-Market-Sentiment-Dashboard/
├─ Real-Time Market Sentiment Dashboard - M_Analyse_Sources_P.py   # main app
└─ README.md

Requirements

  • Python ≥ 3.9
  • Packages: dash, plotly, requests, nltk, torch, transformers, textblob, python-dotenv, praw

Install (example):

python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

pip install dash plotly requests nltk torch transformers textblob python-dotenv praw

On first run, NLTK may fetch corpora at runtime. You can pre-download them to avoid the startup cost:

import nltk
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")
nltk.download("averaged_perceptron_tagger")
nltk.download("maxent_ne_chunker")
nltk.download("words")
nltk.download("omw-1.4")

Environment Variables (.env)

Create a .env file in the project root:

NEWS_API_KEY=your_newsapi_key
REDDIT_CLIENT_ID=your_reddit_app_client_id
REDDIT_CLIENT_SECRET=your_reddit_app_client_secret

Notes:

  • NewsAPI: create an API key at newsapi.org and respect their rate limits/terms.
  • Reddit: create a Reddit application of type "script" to obtain client credentials for PRAW. When Source=Reddit, the app uses the text typed into the query box as the subreddit name.
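The script reads these variables via python-dotenv. A minimal loading sketch (variable names match the .env file above; the graceful fallback when python-dotenv is missing is an assumption, not part of the reference script):

```python
import os

try:
    # python-dotenv is listed in the requirements; fall back gracefully if absent
    from dotenv import load_dotenv
    load_dotenv()  # reads key=value pairs from .env into the process environment
except ImportError:
    pass  # variables may already be set in the shell environment

NEWS_API_KEY = os.getenv("NEWS_API_KEY", "")
REDDIT_CLIENT_ID = os.getenv("REDDIT_CLIENT_ID", "")
REDDIT_CLIENT_SECRET = os.getenv("REDDIT_CLIENT_SECRET", "")
```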

Run

On Windows (filename has spaces):

python "Real-Time Market Sentiment Dashboard - M_Analyse_Sources_P.py"

Or rename to app.py and run:

python app.py

The server binds to port 8051 by default:

  • Open http://127.0.0.1:8051/ in your browser.

How It Works

  1. Fetch content
    • News: everything?q=<query>&from=<date>&to=<date>&language=<lang>&apiKey=$NEWS_API_KEY.
    • Reddit: fetch top/hot posts from the given subreddit; use title + selftext as content.
  2. Preprocess
    • Lowercase, strip punctuation, keep alphabetic tokens, remove stopwords, lemmatize.
  3. Sentiment
    • Transformers pipeline sentiment-analysis (DistilBERT SST‑2).
    • TextBlob polarity in parallel.
  4. Visualization
    • Two histograms (Transformers labels, TextBlob polarity).
  5. Optional generation
    • GPT‑2 text-generation pipeline processes the concatenated article text to produce a short market write‑up.
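Step 2 (preprocessing) can be sketched as below. This is a minimal stand-in with a tiny hard-coded stopword set; the actual script uses NLTK's full English stopword corpus plus WordNet lemmatization:

```python
import re

# Tiny stand-in for NLTK's English stopword list
STOPWORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of", "in"}

def preprocess(text: str) -> list[str]:
    # Lowercase, keep alphabetic tokens only, then drop stopwords
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

# preprocess("The markets are UP 5% today!") -> ["markets", "up", "today"]
```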

GPU (Optional)

To use a GPU with pipelines, construct pipelines with a device index:

from transformers import pipeline

sentiment_model = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=0  # CUDA:0; use -1 for CPU
)

text_generator = pipeline(
    "text-generation",
    model="gpt2",
    device=0  # or -1 for CPU
)

Avoid calling .to(...) on a pipeline object. If you need manual control, move the underlying model to CUDA and pass it into a new pipeline.


Known Issues (from the reference script)

  1. Overwrites Reddit results
    After selecting source='reddit', the script calls get_news_articles(...) again unconditionally, replacing Reddit data. Remove the second call so the branch result is preserved.

    if source == "news":
        articles = get_news_articles(...)
    elif source == "reddit":
        posts = get_reddit_posts(query, limit=int(page_size))
        articles = [{"content": p["text"], "title": p["title"]} for p in posts]
    # DO NOT call get_news_articles again here
  2. Status code check inverted
    get_news_articles prints an error when status_code == 200. It should return articles directly on 200 and only print on errors.

    resp = requests.get(url, params=payload)
    if resp.status_code == 200:
        return resp.json().get("articles", [])
    else:
        print(f"Couldn't retrieve articles. HTTP {resp.status_code}")
        return [{"content": "", "title": "Error"}]
  3. Generated text variable
    The code builds generated_texts (list) but then calls process_generated_text(generated_text) with an undefined variable. Either process each chunk and join, or generate once.

    generated = text_generator(input_text, max_length=512, num_return_sequences=1)
    processed_text = process_generated_text(generated)
  4. Null content fields
    Some NewsAPI articles have content=None. Guard concatenation:

    article_texts = [a.get("content") or a.get("description") or "" for a in articles]
    concatenated_text = " ".join(article_texts)
  5. API key leakage
    The script does print(API_KEY). Remove any logging of secrets.

  6. NLTK downloads at startup
    Repeated nltk.download(...) calls slow startup. Consider moving downloads to setup or guard them.
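One way to guard the downloads (a sketch; the resource paths shown are the standard NLTK data locations) is to probe the local cache first and only fetch what is missing:

```python
def ensure_nltk_data(resources):
    """Download each (path, name) NLTK resource only if it is not already cached."""
    try:
        import nltk
    except ImportError:
        return  # NLTK not installed; pip install nltk first
    for path, name in resources:
        try:
            nltk.data.find(path)  # raises LookupError if the corpus is absent
        except LookupError:
            nltk.download(name, quiet=True)

# Example (run once at setup time rather than on every startup):
# ensure_nltk_data([("tokenizers/punkt", "punkt"),
#                   ("corpora/stopwords", "stopwords"),
#                   ("corpora/wordnet", "wordnet")])
```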


Minimal Patch Example

def get_news_articles(q, from_param, to, language="en", page_size=10):
    url = "https://newsapi.org/v2/everything"
    payload = {"q": q, "from": from_param, "to": to, "language": language, "apiKey": API_KEY, "pageSize": page_size}
    r = requests.get(url, params=payload)
    if r.status_code == 200:
        return r.json().get("articles", [])
    print(f"NewsAPI error: HTTP {r.status_code}: {r.text[:200]}")
    return [{"content": "", "title": "Error"}]

Acknowledgements

  • Hugging Face Transformers and TextBlob for NLP
  • Dash and Plotly for the web UI
  • NewsAPI and Reddit (PRAW) for data access