Kokoro: High-Quality Text-to-Speech (TTS) on Your CPU with ONNX

This sound is generated with Kokoro TTS. The world of text-to-speech (TTS) has seen incredible advancements, but these powerful models often require hefty hardware like GPUs. What if you could run a top-tier TTS model locally on your CPU? Enter **Kokoro**, a TTS model that delivers impressive results even on resource-constrained devices. Kokoro: Small but Mighty. Kokoro stands out for its remarkable efficiency. With just 82 million parameters, it outperforms models several times its size, including XTTS (467M parameters) and MetaVoice (1.2B parameters). This shows that cutting-edge TTS is achievable without relying on massive models and powerful GPUs. ...

January 12, 2025 · 3 min · Nitin

BM-25: Best Matching 25

Introduction. Understanding BM-25: A Powerful Algorithm for Information Retrieval. BM25 is an enhancement of the TF-IDF model that incorporates term frequency saturation and document length normalization to improve retrieval performance. In search engines and information retrieval, a vital piece of the puzzle is ranking the relevance of documents to a given query. One of the most widely used algorithms for this is BM25 (Best Matching 25). BM25 is a probabilistic retrieval function that scores the relevance of a document to a search query. It balances simplicity and effectiveness, which makes it a popular choice in modern search engines and applications. ...
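The two ideas named in the excerpt above, term frequency saturation and document length normalization, can be illustrated with a minimal Python sketch. It assumes the common Okapi BM25 formulation with a smoothed IDF. The function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the toy documents are illustrative assumptions, not taken from the post.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query (Okapi BM25 sketch)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Length normalization: longer-than-average documents are penalized.
            norm = k1 * (1 - b + b * len(doc) / avg_len)
            # Saturation: this ratio approaches (k1 + 1) as tf grows.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores

# Hypothetical toy corpus, already tokenized on whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing in the morning".split(),
]
scores = bm25_scores(["cat"], docs)
```

In this toy corpus, the first two documents each contain "cat" once, and the shorter one scores higher because of length normalization. The third document does not contain the term and scores zero.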

November 10, 2024 · 6 min · Nitin

TF-IDF

Introduction. TF-IDF (Term Frequency-Inverse Document Frequency) is a statistical measure used to evaluate the importance of a word in a document relative to a collection of documents (corpus). It combines two metrics: Term Frequency (TF) and Inverse Document Frequency (IDF). The TF-IDF value increases proportionally with the number of times a word appears in the document and is offset by the frequency of the word in the corpus. Components of TF-IDF. Term Frequency (TF): measures how frequently a term appears in a document. It’s calculated as: ...
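As a rough illustration of how the two metrics combine, here is a minimal Python sketch. The post's own formula is cut off in this excerpt, so the sketch assumes one common textbook variant: TF as the term count divided by document length, and IDF as the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # TF (relative frequency) times IDF (log of inverse document share).
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, already tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "birds sing in the morning".split(),
]
weights = tf_idf(corpus)
```

Under this variant, a word that appears in every document (here "the") gets a weight of zero, while a word unique to one document (here "mat") gets a higher weight than one shared by two documents ("cat").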

November 10, 2024 · 5 min · Nitin

Running Any GGUF Model from Hugging Face with Ollama

Introduction. The latest Ollama update makes it easier than ever to run quantized GGUF models directly from Hugging Face on your local machine. With a single command, you can bypass the previous limitation: you no longer need a separate copy of the model on the Ollama Model Hub. Step-by-Step Guide. 1. Install Ollama: download and install Ollama on your computer. Once installed, the ollama command will be accessible from your command line interface (CLI). 2. Select a Model from Hugging Face ...

November 1, 2024 · 4 min · Nitin

SearchGPT: The Future of Search?

Introduction. OpenAI has launched a groundbreaking new feature for ChatGPT: SearchGPT. This innovative tool blends the conversational nature of a chatbot with the vast resources of the internet, potentially changing the way we search for information forever. With SearchGPT, users can ask questions in natural language and receive concise answers, complete with links to relevant web sources. No more wading through pages of search results or deciphering complex search syntax – SearchGPT aims to streamline the process, making it easier and faster to find what you need. ...

November 1, 2024 · 2 min · Nitin

Unleashing the Full Potential of NotebookLM: Beyond Audio Generation to Comprehensive Research Assistance

NotebookLM: An AI-Powered Research Assistant. NotebookLM is a research assistant powered by Google’s Gemini 1.5 Pro model. It’s centred around the idea of using sources and then leveraging the power of Gemini to interact with and learn from them. Here are some of the key features that make NotebookLM such a powerful tool: 1. Versatile Source Integration. NotebookLM supports a variety of source formats, including: audio files, Markdown documents, PDFs, Google Docs and Slides, websites, YouTube videos, and text notes. Users can upload up to 50 sources per notebook, offering great flexibility in consolidating and analyzing diverse information. ...

October 27, 2024 · 3 min · Nitin

Unveiling the Secrets Behind ChatGPT – Part 2

For part 1, refer to: Unveiling the Secrets Behind ChatGPT – Part 1 (learncodecamp.net). Implementing a Bigram Language Model. When diving into the world of natural language processing (NLP) and language modeling, starting with a simple baseline model is essential. It helps establish a foundation to build upon. One of the simplest and most intuitive models for language generation is the bigram language model. This blog post will walk you through the implementation of a bigram language model using PyTorch, explaining the key concepts, steps, and code snippets along the way. ...

June 17, 2024 · 6 min · Nitin

Unveiling the Secrets Behind ChatGPT – Part 1

Introduction. Hello everyone! By now, you’ve likely heard of ChatGPT, the revolutionary AI system that has taken the world and the AI community by storm. This remarkable technology allows you to interact with an AI through text-based tasks. The Technology Behind ChatGPT: Transformers. The neural network that powers ChatGPT is based on the Transformer architecture, introduced in the 2017 paper “Attention Is All You Need.” GPT stands for “Generative Pre-trained Transformer.” The Transformer architecture is a landmark development that revolutionized AI, primarily in natural language processing (NLP). Initially designed for machine translation, it became the backbone for numerous AI applications, including ChatGPT. ...

June 17, 2024 · 5 min · Nitin

Learning from Introduction to Deep Learning

Introduction. Intro to deep learning. Intelligence: The ability to process information and use it for future decision-making. Artificial Intelligence (AI): Empowering computers with the ability to process information and make decisions. Machine Learning (ML): A subset of AI focused on teaching computers to learn from data. Deep Learning (DL): A subset of ML utilizing neural networks to process raw data and inform decisions. Why Deep Learning Now? The recent surge in deep learning’s capabilities can be attributed to three key factors: ...

May 4, 2024 · 7 min · Nitin

Intro to Large Language Models

The Busy Person’s Guide to Large Language Models: From Inner Workings to Future Possibilities (and Security Concerns). This post explores the fascinating world of large language models (LLMs) like ChatGPT and Llama 2, diving into their inner workings, potential future developments, and even the security challenges they present. It’s a summary of a talk by Andrej Karpathy, offering a comprehensive overview for anyone curious about this rapidly evolving technology. What are LLMs and How Do They Work? Imagine a massive file containing compressed knowledge from the internet – that’s essentially what an LLM is. It’s a complex neural network trained on vast amounts of text data, enabling it to predict and generate human-like text. The process involves two key stages: ...

April 23, 2024 · 4 min · Nitin