🎀 Talk Summary: No-Code RAG Chatbot with PHP, LLMs & Elasticsearch

Speaker: Ashish Tiwari (Senior Developer Advocate, Elastic)


🔑 Introduction

  • Topic: Integrating Generative AI (LLMs) with PHP.
  • Goal: Show how to build chat assistants, semantic search, and vector search without heavy ML expertise.
  • Demo focus: Using Elasticsearch + PHP + LLM (LLaMA 3.1).

🧩 Core Concepts

1. Prompt Engineering

  • LLMs generate responses by predicting the next tokens that follow a prompt.
  • Techniques:
    • Zero-shot inference → ask the model directly, with no examples (classification, tagging).
    • One-shot inference → provide one example in the prompt.
    • Few-shot inference → provide multiple examples → useful for structured outputs (SQL, JSON, XML).
  • Supplying examples in the prompt and iterating on it is known as in-context learning (ICL).
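The shot-based techniques above are just prompt assembly. A minimal sketch, using a sentiment-tagging task invented for illustration (not from the talk):

```python
# Zero-shot: ask directly, no examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The food was great.\nSentiment:"
)

def few_shot_prompt(examples, query):
    """Build a one- or few-shot prompt: each (text, label) pair
    becomes an in-context example before the final query."""
    lines = ["Classify the sentiment of each review as positive or negative."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

# One example -> one-shot; several examples -> few-shot.
one_shot = few_shot_prompt([("Loved it!", "positive")], "The food was great.")
few_shot = few_shot_prompt(
    [("Loved it!", "positive"), ("Terrible service.", "negative")],
    "The food was great.",
)
```

The more examples the prompt carries, the more reliably the model follows the expected output format.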

2. LLM Limitations

  • ❌ Hallucinations (confidently wrong answers).
  • ❌ Complex to build/train from scratch.
  • ❌ No access to real-time or private data (knowledge stops at the training cutoff).
  • ❌ Privacy & security concerns (especially in banking, public sector).

3. RAG (Retrieval-Augmented Generation)

  • Solution to limitations.
  • Workflow:
    1. User query → hits database/vector DB (e.g., Elasticsearch).
    2. Retrieve top 5–10 relevant docs.
    3. Pass retrieved docs as context → LLM generates a grounded answer.
  • Benefits:
    • Grounded responses.
    • Works with private data.
    • Avoids retraining large models.
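A toy version of this workflow, with a keyword-overlap scorer standing in for Elasticsearch (the real demo uses vector search, and the assembled prompt would then be sent to an LLM such as LLaMA 3.1):

```python
# Tiny in-memory "corpus" standing in for an Elasticsearch index.
DOCS = [
    "The 2024 Nobel Prize in Physics went to Hopfield and Hinton.",
    "Elasticsearch supports dense-vector fields for kNN search.",
    "PHP is a popular server-side scripting language.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Score docs by shared lowercase terms with the query; return the top k."""
    terms = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    """Step 3: pass the retrieved docs as context so the answer stays grounded."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Because the model is told to answer only from the supplied context, questions outside the corpus get a refusal instead of a hallucination.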

  • Semantic Search: Understands meaning, not just keywords.
    • Example: "best city" ↔ "beautiful city."
  • Vector Search: Text, images, and audio converted into embeddings (arrays of floats).
    • Enables image search, recommendation systems, music search (via humming).
  • Similarity algorithms: cosine similarity, dot product, (approximate) k-nearest neighbors (kNN).
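The first two scoring functions named above are a few lines each; this sketch uses plain lists of floats rather than a vector library:

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products; higher means more aligned."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product normalized by vector magnitudes:
    1.0 = same direction, 0.0 = orthogonal (unrelated)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
```

In vector search, the query's embedding is compared against the stored document embeddings with one of these measures, and the top-scoring documents are returned.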

πŸ› οΈ Tools & Demo

LLPhant Library (PHP)

  • Open-source PHP library for GenAI apps.
  • Supports:
    • LLMs: OpenAI, Mistral, Anthropic, LLaMA.
    • Vector DBs: Elasticsearch, Pinecone, Chroma, etc.
    • Features: document chunking, embedding generation, semantic retrieval, Q&A (RAG).

Demo Flow

  1. Ingestion:

    • Chunk PDF into smaller pieces (800 chars).
    • Generate embeddings with LLaMA.
    • Store text + vectors in Elasticsearch.
  2. Querying:

    • User question → hits Elasticsearch.
    • Retrieve top 10 docs.
    • Send docs + query → LLaMA → response.
  3. Examples:

    • "Who won the Nobel Prize in Physics 2024?" → Retrieved the correct answer from the PDF context.
    • "How do brain neural networks work?" → Summarized based on the provided docs.
    • "Who won ICC Championship 2025?" → The model stayed within the provided context instead of hallucinating an answer.
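The ingestion step above can be sketched as fixed-size chunking plus embedding; `embed()` below is a stub standing in for the LLaMA embedding call, and the in-memory list stands in for the Elasticsearch index:

```python
def chunk_text(text: str, size: int = 800) -> list[str]:
    """Fixed-size chunking (800 chars, as in the demo); production
    splitters often prefer sentence or paragraph boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    # Stub: a real call would return a dense embedding vector from the model.
    return [float(len(chunk))]

# Store each chunk's text alongside its vector, mirroring the
# text + vector documents written to Elasticsearch.
index = [{"text": c, "vector": embed(c)} for c in chunk_text("x" * 2000)]
```

At query time the same embedding model encodes the question, the nearest chunks are retrieved, and both are sent to the LLM, as in steps 2 and 3 of the demo flow.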

🎯 Key Takeaways

  • Don't train your own LLM → use RAG + search to build assistants on private data.
  • Elasticsearch is a powerful vector DB for semantic + hybrid search.
  • PHP + LLPhant make building RAG chatbots accessible to web developers.
  • RAG is the pattern behind most modern chat assistants.