JHU AgenticAI Project 1 - DualLens Analytics

Overview

DualLens Analytics is an investment analysis tool that combines quantitative financial metrics with qualitative insights into companies' AI initiatives. Using Retrieval-Augmented Generation (RAG), it merges financial growth data with strategic insights from company reports to give a fuller view of each company's potential.

Problem Statement

Traditional investment analysis often focuses solely on financial metrics (e.g., stock growth, revenue, market cap), missing the qualitative dimension of how prepared a company is for the future. On the other hand, qualitative documents like strategy PDFs contain valuable insights about innovation and AI initiatives, but they are difficult to structure, query, and integrate with numeric financial data.

Core Challenges Addressed

Fragmented Data Sources: Financial data (stock prices) and strategic insights (PDFs) exist in silos
Limited Analytical Scope: Manual analysis of growth trends and PDF reports is time-consuming and error-prone
Decisional Blind Spots: Without integrating both quantitative (growth trends) and qualitative (AI initiatives) signals, investors may miss out on high-potential organizations

Features

Financial Data Analysis: Automated collection and visualization of stock market data for multiple companies (GOOGL, MSFT, IBM, NVDA, AMZN)
PDF Document Processing: Extraction and chunking of AI initiative documents from company reports
Vector Store Integration: ChromaDB vector store for semantic search and retrieval
RAG Pipeline: Retrieval-Augmented Generation system for querying company AI initiatives
Unified Analysis: Combined financial metrics and AI initiative insights for comprehensive investment decisions

Project Structure

.
├── JHU AgenticAI Project 1 Learners Notebook (1).ipynb  # Main notebook
├── AMZN.pdf                                               # Amazon AI initiatives document
├── GOOGL.pdf                                              # Google AI initiatives document
├── IBM.pdf                                                # IBM AI initiatives document
├── MSFT.pdf                                               # Microsoft AI initiatives document
├── NVDA.pdf                                               # NVIDIA AI initiatives document
└── README.md                                              # This file

Technologies Used

Python: Core programming language
Jupyter Notebook: Interactive development environment
yfinance: Stock market data collection
LangChain: RAG pipeline and document processing
OpenAI: LLM and embeddings (GPT-4o-mini, text-embedding-ada-002)
ChromaDB: Vector database for document storage and retrieval
Pandas: Data manipulation and analysis
Matplotlib: Data visualization

Key Components

1. Financial Data Collection

Automated stock price history retrieval
Financial metrics extraction (Market Cap, P/E Ratio, Dividend Yield, Beta, Total Revenue)
Data visualization and comparison across companies
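
As a rough illustration of the collection step, the sketch below computes percentage growth from a series of closing prices and gathers the metrics listed above for one ticker. The function names `pct_growth` and `fetch_snapshot` are hypothetical and not taken from the notebook, and the `info` keys are the ones yfinance commonly exposes; any of them may be missing for a given ticker.

```python
from typing import Sequence


def pct_growth(closes: Sequence[float]) -> float:
    """Percentage change from the first to the last closing price."""
    if len(closes) < 2 or closes[0] == 0:
        raise ValueError("need at least two prices and a non-zero first price")
    return (closes[-1] - closes[0]) / closes[0] * 100.0


def fetch_snapshot(symbol: str, period: str = "1y") -> dict:
    """Fetch price history and headline metrics for one ticker.

    Assumption: requires network access and the yfinance package.
    Missing metrics come back as None.
    """
    import yfinance as yf  # imported lazily so pct_growth works offline

    ticker = yf.Ticker(symbol)
    closes = ticker.history(period=period)["Close"].dropna().tolist()
    info = ticker.info
    return {
        "symbol": symbol,
        "growth_pct": pct_growth(closes),
        "market_cap": info.get("marketCap"),
        "pe_ratio": info.get("trailingPE"),
        "dividend_yield": info.get("dividendYield"),
        "beta": info.get("beta"),
        "total_revenue": info.get("totalRevenue"),
    }
```

Calling `fetch_snapshot` for each of GOOGL, MSFT, IBM, NVDA, and AMZN would give one row per company that can be loaded into a Pandas DataFrame for comparison.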

2. Document Processing

PDF text extraction from company AI initiative reports
Text chunking using RecursiveCharacterTextSplitter
Document vectorization using OpenAI embeddings
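
To show what extraction and chunking involve, here is a minimal sketch. `extract_pdf_text` uses PyPDF2's `PdfReader`, while `chunk_text` is a simplified fixed-window stand-in for `RecursiveCharacterTextSplitter`, which additionally tries to split at paragraph and line breaks before falling back to characters. Both function names and the default sizes are assumptions for illustration, not values from the notebook.

```python
from typing import List


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page in a PDF (requires PyPDF2)."""
    from PyPDF2 import PdfReader  # imported lazily; only needed for PDFs

    reader = PdfReader(path)
    # extract_text() can return None for pages without extractable text
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified stand-in for RecursiveCharacterTextSplitter: it ignores
    separators and cuts purely by character count.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.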

3. RAG System

Vector store creation with ChromaDB
Semantic search and retrieval
LLM-powered question answering based on retrieved context
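
The retrieve-then-answer flow can be sketched without any external services. In the example below, an in-memory list of `(text, vector)` pairs stands in for the ChromaDB store and cosine similarity stands in for its search; the actual store may use a different distance metric. All names (`cosine`, `top_k`, `build_prompt`) and the prompt wording are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          store: Sequence[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, contexts: List[str]) -> str:
    """Assemble the prompt that would be sent to the LLM."""
    context_block = "\n\n".join(contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )
```

In the real pipeline, the query and chunk vectors would come from the OpenAI embedding model and the assembled prompt would be passed to the chat model.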

Setup Instructions

Install Dependencies:

pip install langchain_openai
pip install langchain-text-splitters
pip install langchain-community
pip install chromadb
pip install yfinance
pip install pandas
pip install matplotlib
pip install PyPDF2

Configure API Keys:

- Set up your OpenAI API key in the notebook
- The notebook includes configuration for API key management

Upload PDF Documents:

- Upload the 5 company PDF files (AMZN, GOOGL, IBM, MSFT, NVDA)
- The notebook will process and extract text from these documents

Run the Notebook:

- Execute cells sequentially
- The notebook will:
  - Fetch financial data for all companies
  - Process and chunk PDF documents
  - Create vector embeddings
  - Build the RAG system
  - Enable querying of company AI initiatives
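
For the API key step, one common pattern is to read the key from an environment variable and prompt for it only when it is absent, so the key never ends up hard-coded in a notebook cell. This is a sketch under that assumption, not necessarily how the notebook handles it; `get_openai_api_key` is a hypothetical helper, and `OPENAI_API_KEY` is the variable the OpenAI client reads by default.

```python
import os
from getpass import getpass


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, prompting if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the input so the key is not echoed into cell output
        key = getpass(f"Enter {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} is required to run the RAG pipeline")
        os.environ[env_var] = key  # make it visible to libraries loaded later
    return key
```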

Usage

Financial Analysis: Run the financial data collection cells to view stock trends and metrics
Document Processing: Execute the PDF processing cells to extract and chunk company documents
Query AI Initiatives: Use the RAG function to ask questions about company AI initiatives:
```
response = RAG("What are Google's main AI initiatives?")
```

Project Status

✅ All placeholders resolved
✅ Financial data collection implemented
✅ PDF processing pipeline complete
✅ RAG system functional
✅ Vector store integration working

Notes

The notebook contains completed code; all template placeholders have been filled in
An active OpenAI API key must be configured before running the notebook, since the embeddings and the RAG system depend on it
PDF documents should be uploaded before document processing

License

This project is part of the JHU AgenticAI course curriculum.

Author

Taceyes
