🌀

Thoughts

Overview

Output Types:
  • 📝: Text
  • 🖼️: Image
  • </> : Code
  • 🛠️ : Software tool use
  • 🎥 : Video
  • 🎵: Music
  • 📦: 3D
  • 🤖: Robot state
Model types:
  • 📝 → 📝 : LLMs
  • 📝+ 🖼️ → 📝 : Multimodal LLMs
  • 📝+ 🖼️+ 🤖 → 📝 : Multimodal LLMs for Robotics
  • 📝 → </> : Text to Code
  • 📝 → 🛠️ : Text to Software tool use
  • 📝 → 🖼️ : Text to Image
  • 📝 → 🎥 : Text to Video
  • 📝 → 🎵 : Text to Music
  • 🖼️ → 📦 : Image to 3D
  • 📝 → 📦 : Text to 3D

Video Introduction

AI has completely changed the way I work, both in running a business and in creating content.
AI is permeating every piece of software I use, and soon it will be part of the software we all use.
Using AI will soon be as commonplace as using email or Wi-Fi. It will become so universally beneficial that it would seem absurd not to use it.
But, much like the internet, you don’t need to grasp its technical workings to profit from it. Diving deep into its mechanics might be less fruitful than learning about the opportunities it unlocks. The internet, for instance, made these possible:
  • Excelling at building websites from any location worldwide.
  • Amassing a substantial social media following.
  • Mastering the art of advertising on numerous platforms simultaneously.
These are the doors the internet opened, and for the adventurous, it meant being at the forefront of many trends.
Now, what many people call "AI" is in a similar position.
It's prime territory for early adopters: those aiming to expand their businesses, enhance their content, conduct thorough research, hone their artistic skills, improve their writing, refine their speaking, or even adopt a new persona. The possibilities are endless.
Consider this a comprehensive guide—a roadmap for you. Whether you're at the onset of your AI journey or already immersed in projects, I urge you to listen closely.
I will detail all the significant trends and emerging capabilities in this video. From my observations, those reaping the most benefits from AI are the ones blending these new skills in unique ways.
There's always room to refine your workflow, and this video will help you do just that. It aims to provide a holistic overview of AI's advancements this year so you can navigate its intricacies with confidence.
Let's get started.

Summary Video

I went through this 160-page report on AI this year that dissected

New Abilities

  • Comparing the different chatbots and what each of them offers at present
    • Speed of adoption (87)
    • Key Vocab around Chatbots - RLHF
    • How they progressed over the past year.
    • How the competition is intensifying and the key players are no longer open (besides Meta)
    • Talk about the major companies involved in the AI movement
      • OpenAI
      • Anthropic
      • Google
      • Meta
      • Inflection (Pi)
  • How these LLMs are likely to improve, and what this would mean from a practical standpoint
    • Models don’t make good use of the middle of their context windows right now (24)
    • Their output will become harder to tell apart from human-made content, so watermarks will be built in (talk about previous failures)
 
Robotics - How Technology in the real world will change (45ish)
  • PaLM-E: a foundation model for robotics
  • From vision-language models to low-level robot control: RT-2
  • From vision-language models to low-level robot control: RoboCat
  • An autonomous system that races drones faster than human world champions
  • The emergence of maps in the memories of blind navigation agents
  • CICERO masters natural language to beat humans at Diplomacy
 
Creative Tools (51)
  • The text-to-video generation race continues
  • Instruction based editing assistants for text-image generation
  • Welcome 3D Gaussian Splatting
  • NeRFs meet GenAI
  • Zero-shot metric depth is here
  • Segment Anything: a promptable segmentation model with zero-shot generalisation
  • DINOv2: the new default computer vision backbone
  • Another year of progress in music generation
 
Other Science
  • More accurate weather predictions, in the now(casts) and the longer ranges (58)
  • Diffusion models design diverse functional proteins from simple molecular specifications (60)
  • Learning the rules of protein structure at evolutionary-scale with language models
  • Predicting the outcome of perturbing multiple genes without a cell-based experiment
  • Pathogenic or not? Predicting the outcome of all single-amino acid changes
  • Google’s Med-PaLM 2 language model is an expert according to the USMLE
  • Real world-inspired clinical system design for automated medical image analysis
 
Industry
  • NVIDIA is absolutely crushing it and quickly becoming one of the most powerful companies in the world
  • Text to Speech blew up in popularity
  • Synthesia Video growth - and can talk about innovations in this AI avatar space (86)
  • Chegg got absolutely destroyed by AI
  • ChatGPT being adopted for coding - Chart showing the comparison to GitHub and Stack Overflow
  • GitHub Copilot
  • Character.AI, copyright issues, and emotional dependence on these chatbots
  • Midjourney growth
  • Apps are struggling with retention (95)
  • Copyright protection in AI generated content (98)
  • Hugging Face - Open source AI is on a tear at a time when incumbents push for closed source AI (101)
  • Insane bubble might be forming - On the day of NVIDIA’s $50M investment announcement into Recursion Pharmaceuticals, the latter’s share price surged 80% to create an additional $1B of market value. Such a reaction demonstrates the AI fever. (104)
  • Google DeepMind version 2 forming
  • Amazon Investment in Anthropic
  • Waymo and Cruise have been granted permission to launch paid 24/7 autonomous driving services in San Francisco. Previously paid rides were only possible when a driver was in the vehicle for monitoring.
  • Investment in AI unicorns by country (113)
  • 24% of all corporate VC investments went into AI companies in 2023 (116)
  • 2023 sees a massive acceleration in GenAI funding (117)
 
Abilities
  • Show different content people can make (Content Creation)