🎀 Talk Summary: No-Code RAG Chatbot with PHP, LLMs & Elasticsearch

Speaker: Ashish Tiwari (Senior Developer Advocate, Elastic)


🔑 Introduction

  • Topic: Integrating Generative AI (LLMs) with PHP.
  • Goal: Show how to build chat assistants, semantic search, and vector search without heavy ML expertise.
  • Demo focus: Using Elasticsearch + PHP + LLM (LLaMA 3.1).

🧩 Core Concepts

1. Prompt Engineering

  • LLMs generate responses by predicting the next tokens that follow a prompt.
  • Techniques:
    • Zero-shot inference → ask the model directly, with no examples (classification, tagging).
    • One-shot inference → provide one example in the prompt.
    • Few-shot inference → provide multiple examples → useful for structured outputs (SQL, JSON, XML).
  • Supplying examples in the prompt and iterating on it is known as in-context learning (ICL).
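The shot-based techniques above are just prompt assembly. A minimal sketch, using a sentiment-tagging task invented for illustration (not from the talk):

```python
# Zero-shot: ask directly, no examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The food was great.\nSentiment:"
)

def few_shot_prompt(examples, query):
    """Build a one- or few-shot prompt: each (text, label) pair
    becomes an in-context example before the final query."""
    lines = ["Classify the sentiment of each review as positive or negative."]
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

# One example -> one-shot; several examples -> few-shot.
one_shot = few_shot_prompt([("Loved it!", "positive")], "The food was great.")
few_shot = few_shot_prompt(
    [("Loved it!", "positive"), ("Terrible service.", "negative")],
    "The food was great.",
)
```

The more examples the prompt carries, the more reliably the model follows the expected output format.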

2. LLM Limitations

  • ❌ Hallucinations (confidently wrong answers).
  • ❌ Complex to build/train from scratch.
  • ❌ No access to real-time or private data (knowledge stops at the training cutoff).
  • ❌ Privacy & security concerns (especially in banking, public sector).

3. RAG (Retrieval-Augmented Generation)

  • Solution to limitations.
  • Workflow:
    1. User query → hits database/vector DB (e.g., Elasticsearch).
    2. Retrieve top 5–10 relevant docs.
    3. Pass retrieved docs as context → LLM generates a grounded answer.
  • Benefits:
    • Grounded responses.
    • Works with private data.
    • Avoids retraining large models.
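A toy version of this workflow, with a keyword-overlap scorer standing in for Elasticsearch (the real demo uses vector search, and the assembled prompt would then be sent to an LLM such as LLaMA 3.1):

```python
# Tiny in-memory "corpus" standing in for an Elasticsearch index.
DOCS = [
    "The 2024 Nobel Prize in Physics went to Hopfield and Hinton.",
    "Elasticsearch supports dense-vector fields for kNN search.",
    "PHP is a popular server-side scripting language.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Score docs by shared lowercase terms with the query; return the top k."""
    terms = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    """Step 3: pass the retrieved docs as context so the answer stays grounded."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Because the model is told to answer only from the supplied context, questions outside the corpus get a refusal instead of a hallucination.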

  • Semantic Search: Understands meaning, not just keywords.
    • Example: "best city" ↔ "beautiful city."
  • Vector Search: Text, images, and audio converted into embeddings (arrays of floats).
    • Enables image search, recommendation systems, music search (via humming).
  • Similarity algorithms: cosine similarity, dot product, (approximate) k-nearest neighbors (kNN).
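The first two scoring functions named above are a few lines each; this sketch uses plain lists of floats rather than a vector library:

```python
import math

def dot(a, b):
    """Dot product: sum of element-wise products; higher means more aligned."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product normalized by vector magnitudes:
    1.0 = same direction, 0.0 = orthogonal (unrelated)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
```

In vector search, the query's embedding is compared against the stored document embeddings with one of these measures, and the top-scoring documents are returned.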

πŸ› οΈ Tools & Demo

LLPhant Library (PHP)

  • Open-source PHP library for GenAI apps.
  • Supports:
    • LLMs: OpenAI, Mistral, Anthropic, LLaMA.
    • Vector DBs: Elasticsearch, Pinecone, Chroma, etc.
    • Features: document chunking, embedding generation, semantic retrieval, Q&A (RAG).

Demo Flow

  1. Ingestion:

    • Chunk PDF into smaller pieces (800 chars).
    • Generate embeddings with LLaMA.
    • Store text + vectors in Elasticsearch.
  2. Querying:

    • User question → hits Elasticsearch.
    • Retrieve top 10 docs.
    • Send docs + query → LLaMA → response.
  3. Examples:

    • "Who won the Nobel Prize in Physics 2024?" → Retrieved the correct answer from the PDF context.
    • "How do brain neural networks work?" → Summarized based on the provided docs.
    • "Who won ICC Championship 2025?" → The model stayed within the provided context instead of hallucinating an answer.
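The ingestion step above can be sketched as fixed-size chunking plus embedding; `embed()` below is a stub standing in for the LLaMA embedding call, and the in-memory list stands in for the Elasticsearch index:

```python
def chunk_text(text: str, size: int = 800) -> list[str]:
    """Fixed-size chunking (800 chars, as in the demo); production
    splitters often prefer sentence or paragraph boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk: str) -> list[float]:
    # Stub: a real call would return a dense embedding vector from the model.
    return [float(len(chunk))]

# Store each chunk's text alongside its vector, mirroring the
# text + vector documents written to Elasticsearch.
index = [{"text": c, "vector": embed(c)} for c in chunk_text("x" * 2000)]
```

At query time the same embedding model encodes the question, the nearest chunks are retrieved, and both are sent to the LLM, as in steps 2 and 3 of the demo flow.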

🎯 Key Takeaways

  • Don't train your own LLM → use RAG + search to build assistants on private data.
  • Elasticsearch is a powerful vector DB for semantic + hybrid search.
  • PHP + LLPhant make building RAG chatbots accessible to web developers.
  • RAG is the pattern behind most modern chat assistants.