DualLens Analytics is an investment analysis tool that combines quantitative financial metrics with qualitative insights drawn from companies' AI initiatives. Using Retrieval-Augmented Generation (RAG), the project merges financial growth data with strategic insights from company reports to give a more complete view of each company's potential.
Traditional investment analysis often focuses solely on financial metrics (e.g., stock growth, revenue, market cap), missing the qualitative dimension of how prepared a company is for the future. On the other hand, qualitative documents like strategy PDFs contain valuable insights about innovation and AI initiatives, but they are difficult to structure, query, and integrate with numeric financial data.
- Fragmented Data Sources: Financial data (stock prices) and strategic insights (PDFs) exist in silos
- Limited Analytical Scope: Manual analysis of growth trends and PDF reports is time-consuming and error-prone
- Decisional Blind Spots: Without integrating both quantitative (growth trends) and qualitative (AI initiatives) signals, investors may miss out on high-potential organizations
- Financial Data Analysis: Automated collection and visualization of stock market data for multiple companies (GOOGL, MSFT, IBM, NVDA, AMZN)
- PDF Document Processing: Extraction and chunking of AI initiative documents from company reports
- Vector Store Integration: ChromaDB vector store for semantic search and retrieval
- RAG Pipeline: Retrieval-Augmented Generation system for querying company AI initiatives
- Unified Analysis: Combined financial metrics and AI initiative insights for comprehensive investment decisions
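As a rough illustration of the financial data analysis feature, the sketch below picks a few metrics out of a yfinance-style info dictionary. The key names (`marketCap`, `trailingPE`, `dividendYield`, `beta`, `totalRevenue`) are ones yfinance commonly exposes through `Ticker(...).info`, but their availability varies by ticker, so treat them as assumptions. `extract_metrics` and `fetch_info` are hypothetical helper names and are not taken from the notebook.

```python
from typing import Any, Dict, List, Optional

# Display label -> key in a yfinance-style info dict (assumed key names).
METRIC_KEYS: Dict[str, str] = {
    "Market Cap": "marketCap",
    "P/E Ratio": "trailingPE",
    "Dividend Yield": "dividendYield",
    "Beta": "beta",
    "Total Revenue": "totalRevenue",
}

# Tickers covered by the project.
TICKERS: List[str] = ["GOOGL", "MSFT", "IBM", "NVDA", "AMZN"]


def extract_metrics(info: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Select the metrics of interest; keys missing from `info` become None."""
    return {label: info.get(key) for label, key in METRIC_KEYS.items()}


def fetch_info(ticker: str) -> Dict[str, Any]:
    """Fetch the info dict for one ticker (requires yfinance and network access)."""
    import yfinance as yf  # imported lazily so extract_metrics works offline

    return yf.Ticker(ticker).info


# Example (needs network access):
#   for symbol in TICKERS:
#       print(symbol, extract_metrics(fetch_info(symbol)))
```

Keeping the extraction step separate from the network call makes it easy to check the metric mapping against a hand-written dictionary before fetching live data.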
.
├── JHU AgenticAI Project 1 Learners Notebook (1).ipynb # Main notebook
├── AMZN.pdf # Amazon AI initiatives document
├── GOOGL.pdf # Google AI initiatives document
├── IBM.pdf # IBM AI initiatives document
├── MSFT.pdf # Microsoft AI initiatives document
├── NVDA.pdf # NVIDIA AI initiatives document
└── README.md # This file
- Python: Core programming language
- Jupyter Notebook: Interactive development environment
- yfinance: Stock market data collection
- LangChain: RAG pipeline and document processing
- OpenAI: LLM and embeddings (GPT-4o-mini, text-embedding-ada-002)
- ChromaDB: Vector database for document storage and retrieval
- Pandas: Data manipulation and analysis
- Matplotlib: Data visualization
- Automated stock price history retrieval
- Financial metrics extraction (Market Cap, P/E Ratio, Dividend Yield, Beta, Total Revenue)
- Data visualization and comparison across companies
- PDF text extraction from company AI initiative reports
- Text chunking using RecursiveCharacterTextSplitter
- Document vectorization using OpenAI embeddings
- Vector store creation with ChromaDB
- Semantic search and retrieval
- LLM-powered question answering based on retrieved context
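To make the chunking step more concrete, here is a minimal sketch of splitting text into overlapping chunks. It is a simplified fixed-size sliding window, not `RecursiveCharacterTextSplitter` itself, which additionally tries to split on a hierarchy of separators such as paragraphs and sentences. The function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions for illustration only.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into windows of up to chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap trades some duplicated storage for better recall: context that straddles a chunk boundary remains retrievable.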
- Install Dependencies:

      pip install langchain_openai
      pip install langchain-text-splitters
      pip install langchain-community
      pip install chromadb
      pip install yfinance
      pip install pandas
      pip install matplotlib
      pip install PyPDF2

- Configure API Keys:
  - Set up your OpenAI API key in the notebook
  - The notebook includes configuration for API key management

- Upload PDF Documents:
  - Upload the 5 company PDF files (AMZN, GOOGL, IBM, MSFT, NVDA)
  - The notebook will process and extract text from these documents

- Run the Notebook:
  - Execute cells sequentially
  - The notebook will:
    - Fetch financial data for all companies
    - Process and chunk PDF documents
    - Create vector embeddings
    - Build the RAG system
    - Enable querying of company AI initiatives
- Financial Analysis: Run the financial data collection cells to view stock trends and metrics
- Document Processing: Execute the PDF processing cells to extract and chunk company documents
- Query AI Initiatives: Use the RAG function to ask questions about company AI initiatives:
response = RAG("What are Google's main AI initiatives?")
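Conceptually, a call like the one above retrieves the chunks most similar to the question and passes them to the LLM as context. The sketch below illustrates that retrieve-then-prompt flow with a toy word-count embedding and cosine similarity. It stands in for the OpenAI embeddings and ChromaDB store used in the notebook and does not reproduce the notebook's `RAG` function; `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, and the prompt wording is an assumption.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    query_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real pipeline the vector store performs the similarity search over stored embeddings, and the assembled prompt is sent to the chat model to produce the final answer.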
✅ All placeholders resolved
✅ Financial data collection implemented
✅ PDF processing pipeline complete
✅ RAG system functional
✅ Vector store integration working
- The notebook includes resolved code with all placeholders filled
- API key configuration is required before running
- PDF documents should be uploaded before document processing
- The RAG system requires an active OpenAI API key
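One way to satisfy the API key requirement without hard-coding the key in a cell is to read it from the `OPENAI_API_KEY` environment variable, which the OpenAI client libraries look for by default. The sketch below is an assumption about how this could be done, not the notebook's actual configuration code; `get_openai_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def get_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key from the environment, failing early if absent."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the notebook."
        )
    return key
```

Failing early with a clear message avoids a less obvious authentication error later, when the embedding or chat calls are first made.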
This project is part of the JHU AgenticAI course curriculum.
Taceyes