Check out the original stream here: https://x.com/i/broadcasts/1gqGvjeBljOGB

TRANSCRIPT
Welcome to the Grok 3 presentation. The mission of xAI and Grok is to understand the universe. We want to understand the nature of the universe so we can figure out what's going on. Where are the aliens? What's the meaning of life? How does the universe end? How did it start? All these fundamental questions, we're driven by curiosity about the nature of the universe, and that's also what causes us to be a maximally truth-seeking AI, even if that truth is sometimes at odds with what is politically correct. In order to understand the nature of the universe, you must absolutely rigorously pursue truth, or you will not understand the universe; you'll be suffering from some amount of delusion or error. So that is our goal: figure out what's going on. And we're very excited to present Grok 3, which we think is an order of magnitude more capable than Grok 2, in a very short period of time, and that's thanks to the hard work of an incredible team. I'm honored to work with such a great team, and of course we'd love to have some of the smartest humans out there join our team. So let's go.
Hi everyone, my name is Igor, I lead
Meet the Team and Grok's Evolution
engineering at xAI. I'm Jimmy Ba, leading research. I'm Tony, working on the reasoning team. All right, and I'm Elon, I don't do anything, I just show up occasionally. Yeah, so, like you mentioned, Grok is the tool that we're working on. Grok is our AI that we're building here at xAI, and we've been working extremely hard over the last few months to improve Grok as much as we can, so we can give all of you access to it. We think it's going to be extremely useful, we think it's going to be interesting to talk to, funny, really really funny, and we're going to explain to you how we've improved Grok over the last few months. We've made quite a jump in capabilities. Yeah, actually, we should maybe also explain why we call it Grok. Grok is a word from a Heinlein novel, Stranger in a Strange Land. It's used by a guy who's raised on Mars, and to grok is to fully and profoundly understand something. That's what the word grok means: to fully and profoundly understand something. And empathy is
important, true. So yeah, if we chart xAI's progress: it has only been 17 months since we started. We kicked off our very first model, Grok 1, which was almost like a toy by this point, only 314 billion parameters. And now if you plot the progress, with time on the x-axis and performance on our favorite benchmark, MMLU, on the y-axis, we're literally progressing at unprecedented speed across the whole field. Then we kicked off Grok 1.5 right after Grok 1 released, after November 2023, and then Grok 2. So if you look at where all the performance is coming from: when you have a crack engineering team and all the best AI talent, the one thing you still need is a big cluster, because big intelligence comes from big clusters. So we can re-plot the entire progress of xAI, now replacing the benchmark on the y-axis with the total amount of training FLOPs, that is, how many GPUs we can run at any given time to train our large language models to compress the entire internet. So after all... All human knowledge, really. That's right, the internet being part of it, but it's really all human knowledge, everything. Yeah, the whole internet fits onto a USB stick at this point; it's like all the human tokens. Very soon, into the
real world.
Technical Challenges and Data Center Expansion
We had so much trouble actually training Grok 2 back in the day. We kicked off the model around February, and we thought we had a large number of chips, but it turned out we could barely get 8K training chips running coherently at any given time, and we had so many cooling and power issues. I think you were there in the data center. Yeah, it was really more like 8K chips on average at 80% efficiency, more like 6,500 effective H100s training for, you know, several months. But, you know, the 100K... So yeah, that's right, more than 100K now. That's right. So what's the next step?
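As a rough back-of-the-envelope sketch (my own illustrative assumptions, not figures from the stream): taking each H100 at roughly 1e15 BF16 FLOP/s peak and an assumed utilization, the jump from ~6,500 effective H100s to a 100K-plus cluster by itself accounts for roughly the 15x compute multiplier the team mentions later.

```python
# Back-of-the-envelope training-compute estimate (illustrative assumptions,
# not official xAI figures).
H100_PEAK_FLOPS = 1.0e15   # ~1e15 BF16 FLOP/s per H100 (rounded, assumed)
UTILIZATION = 0.4          # assumed model FLOPs utilization (MFU)
SECONDS_PER_DAY = 86_400

def training_flops(num_gpus: int, days: float) -> float:
    """Total effective training FLOPs for a cluster running for `days` days."""
    return num_gpus * H100_PEAK_FLOPS * UTILIZATION * days * SECONDS_PER_DAY

grok2 = training_flops(6_500, 90)    # ~6,500 effective H100s, ~3 months
grok3 = training_flops(100_000, 90)  # 100K+ GPUs, same assumed duration

print(f"Grok 2 ~ {grok2:.2e} FLOPs")
print(f"Grok 3 ~ {grok3:.2e} FLOPs ({grok3 / grok2:.1f}x)")
```

With equal durations the ratio is just the GPU ratio, 100,000 / 6,500, which is about 15.4x.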
Right. So after Grok 2, if we want to continue to accelerate, we have to take matters into our own hands. We have to solve all the cooling, all the power issues, everything. Yeah, so in April of last year, Elon decided that really the only way for xAI to succeed, for xAI to build the best AI out there, is to build our own data center. So really, we realized we had to build the data center, in about four months. It turned out it took us 122 days to get the first 100K GPUs up and running, and that was a monumental effort. We believe it's the biggest fully connected H100 cluster of its kind. We actually decided that we needed to double the size of the cluster pretty much immediately if we wanted to build the kind of AI that we want to build. So we then had another phase, which we haven't talked about publicly yet, this is the first time we're talking about it, where we doubled the capacity of the data center yet again, and that one only took us 92 days. So we've been able to use all of these GPUs, all this compute, to improve Grok in the meantime, and basically today we're going to present you the results of that, the fruits that came from it. Yeah, so all roads lead to Grok 3: 10x more compute, more than 10x really, maybe 15x, compared to our previous generation model. Grok 3 finished pre-training in early January, and the model is actually still currently training. So this is a little preview of our benchmark
Grok 3's Advanced Capabilities and Live Demos
numbers. So we evaluated Grok 3 on three different categories: general mathematical reasoning, general knowledge about STEM and science, and also computer science coding. So AIME, the American Invitational Mathematics Examination, is hosted once a year. If we evaluate model performance on it, we can see that Grok 3 is, across the board, in a league of its own; even its little brother, Grok 3 mini, is reaching the frontier across all the other competitors. You might say, well, at this point all these benchmarks are just evaluating memorization of the textbooks, memorization of the GitHub repos. How about real-time usefulness? How about actually using those models in our product? So what we did instead is we actually kicked off a blind test of our Grok 3 model, codenamed Chocolate. It's pretty hot. Yeah, hot chocolate. It's been running on this platform called Chatbot Arena for two weeks; I think the entire X platform at some point speculated this might be the next generation of AI coming. So how this Chatbot Arena works is that it strips away the entire product surface. It's just a raw comparison of the engines of those AIs, the language models themselves, behind an interface where the user submits one single query, gets shown two responses without knowing which model each comes from, and then votes. In this blind test, Grok 3, an early version of Grok 3, already reached an Elo of about 1400. No other model has reached that Elo score in comparison to all the other models. And it's not just one single category; it's 1400 aggregated across all the categories: chat capabilities, instruction following, coding. So it's number one across the board in this blind test, and it's still climbing, so we're actually going to keep updating it. So it's about 1400 and climbing. Yeah.
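The Elo numbers quoted here come from exactly this kind of pairwise blind voting. As a rough illustration of how such ratings move (standard online Elo; Chatbot Arena's actual methodology fits a Bradley-Terry model over all votes, so this is a simplification):

```python
# Minimal Elo update from pairwise blind votes (illustrative sketch only).

def expected_score(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return new ratings after one head-to-head vote."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1 - s_a) - (1 - e_a))

# A 1400-rated model is expected to beat a 1300-rated one ~64% of the time.
print(expected_score(1400, 1300))
r_a, r_b = update(1400.0, 1300.0, a_won=True)
print(r_a, r_b)  # the winner gains exactly the points the loser drops
```

The 400-point scale means a 100-point Elo gap translates into roughly a 64/36 win rate, which is why a clear lead at 1400 is significant.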
In fact, we have a version of the model that we think is already much better than the one that we tested here. Yeah, we'll see how far it gets, but that's the one that we're working on, or talking about, today. Yeah, actually, one thing: if you're using Grok 3, I think you may notice improvements almost every day, because we're continuously improving the model. So literally even within 24 hours you'll see improvements. Yep. But we believe here at xAI that having the best pre-trained model is not enough. That's not enough to build the best AI. The best AI needs to think like a human: to contemplate all the possible solutions, self-critique, verify all the solutions, backtrack, and also think from first principles. That's a very important capability. So we believe that as we take the best pre-trained model and continue training it with reinforcement learning, it will elicit the additional reasoning capabilities that allow the model to become so much better, and to scale not just at training time but at test time as well. We already found the model extremely useful internally, saving hundreds of hours of coding time. So you're the power user of our reasoning model; what are the use cases? Yeah, so like Jimmy said, we've added advanced reasoning capabilities to Grok, and we've been testing them pretty heavily over the last few weeks, and we want to give you a little bit of a taste of what it looks like when Grok is solving hard reasoning
problems. So we prepared two little problems for you: one comes from physics, and one is actually a game that Grok is going to write for us. When it comes to the physics problem, what we want Grok to do is to plot a viable trajectory for a transfer from Earth to Mars, and then, at a later point in time, a transfer back from Mars to Earth. That requires some physics that Grok will have to understand. So we're going to challenge Grok to come up with a viable trajectory, calculate it, and then plot it for us so we can see it. And yeah, this is totally unscripted, by the way. This is the Grok interface, and we've typed in the text that you can see here: generate code for an animated 3D plot of a launch from Earth, landing on Mars, and then back to Earth at the next window. We've now kicked off the query, and you can see Grok is thinking. Part of Grok's advanced reasoning capabilities are these thinking traces that you can see here; you can even go inside and actually read what Grok is thinking as it's going through the problem, as it's trying to solve it. Yeah, we should say we do some obfuscation of the thinking so that our model doesn't get totally copied instantly, so there's more to the thinking than is displayed. And because this is totally unscripted, there's actually a chance that Grok might make a little coding mistake and it might not actually work. So just in case, we're going to launch two more instances of this, so if something goes wrong we're able to switch to those and show you something that's presentable. Right, so we're kicking off the other two as well. And like I said, we have a second problem. Actually, one of our favorite activities here at xAI is having Grok write games for us. Not just any old game, any game that you might already be familiar with, but actually creating new games on the spot and being creative about it. One example that we found was really, really fun is: create a game that's a mixture of the two games Tetris and Bejeweled. So this is maybe an important thing: obviously, if you ask an AI to create a game like Tetris, there are many examples of Tetris on the internet, or a game like Bejeweled, whatever, it can copy them. What's interesting here is that it achieved a creative solution combining the two games that actually works, and is a good game. Yeah, it's creative; we're seeing the beginnings of creativity. Fingers crossed that we can recreate that. Hopefully it works. Actually, because this is a bit more challenging, we're going to use something special here, which we call Big Brain. That's our mode in which we use more computation, more reasoning, for Grok, just to make sure there's a good chance it might actually do it. So we're also going to fire off three attempts here at solving this, at creating this game that's a mixture of Tetris and Bejeweled. Yeah, let's see what Grok comes up with. I've played the game, it's pretty good. It's like, wow, okay, this is something. Yeah. So while Grok is spinning in the background, we can now actually talk about some concrete numbers: how well is Grok doing across tons of different tasks that we've tested it on? We'll hand it over to Tony to talk about that.
Yeah, okay, so let's see how Grok does on those interesting, challenging benchmarks. So yeah, reasoning again refers to those models that actually think for quite a long time before they try to solve a problem. So in this case, around a month ago Grok 3 pre-training finished, and after that we worked very hard to put the reasoning capability into the current Grok 3 base model. But again, this is very early days, so the model is still currently in training. So right now, what we're going to show people is this beta version of the Grok 3 reasoning model. Alongside, we're also training a mini version of the reasoning model. So essentially, on this plot you can see Grok 3 reasoning beta and then Grok 3 mini reasoning. Grok 3 mini reasoning is actually a model that we trained for a much longer time, and you can see that sometimes it actually performs slightly better compared to Grok 3 reasoning. This also just means that there's huge potential for Grok 3 reasoning, because it's been trained for much less time. So, all right, let's actually look at how it does on those three benchmarks Jimmy already introduced. Essentially we're looking at three different areas: mathematics, science, and coding. For math we're picking this high-school competition math; for science we actually pick PhD-level science questions; and for coding it's also pretty challenging competitive coding, plus some LeetCode, which is the kind of coding interview problem people usually get when they interview for companies. So on those benchmarks you can see that Grok 3 actually performs quite well across the board compared to other competitors. Yeah, so it's pretty promising; these models are very smart. So Tony, what are those
shaded bars? Yeah, okay, so I'm glad you asked this question. For those models, because they can reason, they can think, you can also ask them to think even longer. You can spend more of what we call test-time compute, which means you can spend more time to reason, to think about a problem, before you spit out the answer. So in this case, the shaded bar means that we just ask the model to spend more time: it can solve the same problem many, many times before it tries to conclude what the right solution is. And once you give this compute, this kind of budget, to the model, it turns out the model can perform even better. That's essentially what the shaded bars in the plots show. Right, so I think this is really exciting, because now, instead of just doing one chain of thought with the AI, why not do multiple all at once? Yes, so that's a very powerful technique that allows us to continue scaling the model's capabilities after training.
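That "solve the same problem many times and conclude" idea is often implemented as self-consistency: sample several independent chains of thought and majority-vote the final answers. A minimal sketch, where `sample_answer` is a hypothetical stub standing in for one full reasoning rollout of the model (not any real xAI API):

```python
import random
from collections import Counter

# Self-consistency sketch: sample N independent chains of thought and
# majority-vote the final answers. `sample_answer` is a hypothetical stub
# standing in for one full reasoning rollout of a model.

def sample_answer(question: str) -> str:
    # Stub: a noisy solver that returns the right answer 70% of the time.
    return "42" if random.random() < 0.7 else random.choice(["41", "43"])

def majority_vote(question: str, n_samples: int = 32) -> str:
    answers = [sample_answer(question) for _ in range(n_samples)]
    answer, _count = Counter(answers).most_common(1)[0]
    return answer

random.seed(0)
print(majority_vote("What is 6 * 7?"))  # with enough samples, "42" wins out
```

Even a solver that is only right 70% of the time per attempt becomes near-certain under the vote, which is one intuition for why spending test-time compute lifts the shaded bars.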
People often ask: are we actually just overfitting to the benchmarks? Yes, this is definitely a question that we ask ourselves, whether we're overfitting to those current benchmarks. Luckily, we have a real test. About five days ago, AIME 2025 just finished; this is the competition where high school students compete. So we got this very fresh new competition, and we asked our two models to compete on the same benchmark, the same exam. And it turns out, very interestingly, that Grok 3 reasoning, the big one, actually does better on this particular fresh new exam. This also means that the generalization capability of the big model is stronger, much stronger, compared to the smaller model. If you compare on last year's exam, it's actually the opposite: the smaller model kind of learned the previous exams better. So this actually shows some kind of true generalization from the model. That's right. So 17 months ago, our Grok 0 and Grok 1 barely solved any high school problems, that's right, and now we have a kid that has already graduated; Grok is ready to go to college, is that right? Yeah, I mean, it won't be long before it simply aces the human exams; they won't be hard, they'll be too easy. Yeah. And internally, as Grok continues to evolve, we're going to talk about what we're excited about, but very soon there will be no benchmarks left. Yeah. One thing that's quite fascinating, I think, is that we basically only trained Grok's reasoning abilities on math problems and competitive coding problems, so very, very specialized kinds of tasks. But somehow it's able to work on all kinds of other different tasks, including creating games, lots and lots of different things. What seems to be happening is that Grok basically learns this ability to detect its own mistakes in its thinking, correct them, persist on a problem, try lots of different variants, and pick the one that's best. So there are these generalizing abilities that Grok learns from mathematics and from coding, which it can then use to solve all kinds of other problems. So that's, yeah, that's pretty... I mean, reality is the instantiation of mathematics. That's right. And one
thing we're actually really excited about, going back to our founding mission: what if one day we have a computer, just like Deep Thought, that utilizes our entire cluster for one very important problem at test time, all the GPUs turned on? Right, so I think back then we were building the GPU cluster together; you were plugging in cables, and I remember that when we turned on the first initial test, you could hear all the GPUs humming in the hallway. That almost felt spiritual. Yeah, that's actually a pretty cool thing we're able to do, that we can go into the data center and tinker with the machines there. For example, we went in and unplugged a few of the cables and just made sure that our training setup was still running stable. That's something that I think most AI teams out there don't usually do, but it actually totally unlocks a new level of reliability, and of what you're able to do with the hardware. So, okay, when are we going to solve the Riemann hypothesis? The easiest solution is to enumerate over all possible strings; as long as you have a verifier and enough compute, you'll be able to do it. Okay, my projection will be... what's your guess, what does your neural net calculate? So my bold prediction: three years ago I told you, and I think now it's two years. Two things are going to happen: we're going to see machines win some medals, the Turing Award, the Fields Medal, the Nobel Prize, probably with some expert in the loop, right, so the expert uplifting them. Do you mean this year or next year? Oh, okay, that's what
it comes down to, really. Yeah. So it looks like Grok finished all of its thinking on the two problems, so let's take a look at what it said. All right, so this was the little physics problem we had. We've collapsed the thoughts here, so they're hidden, and then we see Grok's answer below that. It explains that it wrote a Python script here using matplotlib, then gives us all of the code. So let's take a quick look at the code. It seems like it's doing reasonable things here, not totally off the mark. It says solve_kepler here, so maybe it's solving Kepler's equation numerically. Yeah, there's really one way to find out if this thing is working; I'd say let's give it a try, let's run the code. All right, and we can see, yeah, Grok is animating two different planets, Earth and Mars, and the green ball is the vehicle that's transiting, the spacecraft that's transitioning between Earth and Mars. You can see the journey from Earth to Mars, and it looks like, yeah, indeed, the astronauts return safely at the right moment in time. Now, obviously this was just generated on the spot, so we can't tell you right now whether that was actually a correct solution. We're going to take a closer look; maybe we'll call some colleagues from SpaceX and ask them if this is legit. It's pretty close. I mean, there's a lot of complexity in the actual orbits that has to be taken into account, but this is pretty close to what it would look like. Awesome. In fact, I have that on my pendant here; it's got the Earth, Mars, and the transfer orbit on it. When are we going to install Grok on a rocket? Well, I suppose in two years.
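The solve_kepler routine the demo spotted is a standard piece of orbital mechanics: given a mean anomaly M and eccentricity e, solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E. Grok's actual generated code wasn't shown in full, so this is my own sketch of the usual Newton's-method approach, not the demo's code:

```python
import math

def solve_kepler(M: float, e: float, tol: float = 1e-12) -> float:
    """Solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E
    using Newton's method. M is in radians, 0 <= e < 1 (elliptical orbit)."""
    E = M if e < 0.8 else math.pi  # common starting guess
    for _ in range(50):
        delta = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= delta
        if abs(delta) < tol:
            break
    return E

# Example: a mildly eccentric orbit (e ~ 0.2 is roughly the eccentricity of a
# Hohmann-style Earth-Mars transfer ellipse).
E = solve_kepler(M=1.0, e=0.2)
print(E, E - 0.2 * math.sin(E))  # the second value recovers M = 1.0
```

Once E is known, the spacecraft's position on the transfer ellipse follows from standard formulas, which is presumably what the animation then plots frame by frame.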
Two years? Everything is two years away. Well, an Earth-Mars transit window occurs every 26 months. We're currently in a transit window; the next one would be approximately November of next year, roughly the end of next year. And if all goes well, SpaceX will send Starship rockets to Mars with Optimus robots, and Grok. I'm curious what this combination of Tetris and Bejeweled looks like, the game we've given a name internally. So, okay, we also have an output from Grok here; it says it wrote a Python script and explains what it's been doing. If you look at the code now, there are some constants being defined here, some colors, then the tetrominoes, the pieces of Tetris, are there. Obviously it's very hard to see at a glance whether this is good, so we've got to run it to figure out if it's working. Well, let's give it a try, fingers crossed. All right, so this
Exploring Game Mechanics: Tetris and Bejeweled
kind of looks like Tetris, but the colors are a little bit off, right, the colors are different here. And if you think about what's going on here: Bejeweled has this mechanic where, if you get three jewels in a row, they disappear, and also gravity activates. So what happens if you get three of the same color together? Oh, so something happened. So I think what Grok did in this version is that once you connect at least three blocks of the same color in a row, they disappear, and then gravity activates and all the other blocks fall down. I'm kind of curious whether there's still a Tetris mechanic here: if a line is full, does it actually clear it, or what happens then? Okay, it's up to interpretation, you know, so who knows. Yeah, I mean, it'll do different variants when you ask it; it doesn't do the same thing every time. Exactly, we've seen a few others that work very differently, but this one seems cool. So, are we ready for a game studio at xAI?
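The mechanic described above (clear any horizontal run of three or more same-colored blocks, then let everything fall) is easy to sketch on a grid. This is my own minimal illustration of the rule, not the code Grok generated in the demo:

```python
# Minimal match-3 mechanic on a grid: clear horizontal runs of three or more
# equal colors, then apply gravity so surviving blocks fall down.
# Illustrative sketch only, not the game code Grok generated in the demo.

def clear_and_drop(grid: list[list[str]]) -> list[list[str]]:
    rows, cols = len(grid), len(grid[0])
    keep = [[True] * cols for _ in range(rows)]
    # Mark horizontal runs of length >= 3 for clearing ("." means empty).
    for r in range(rows):
        c = 0
        while c < cols:
            run = c
            while run < cols and grid[r][run] == grid[r][c] and grid[r][c] != ".":
                run += 1
            if run - c >= 3:
                for k in range(c, run):
                    keep[r][k] = False
            c = max(run, c + 1)
    # Gravity: surviving blocks in each column fall to the bottom.
    out = [["." for _ in range(cols)] for _ in range(rows)]
    for c in range(cols):
        stack = [grid[r][c] for r in range(rows) if keep[r][c] and grid[r][c] != "."]
        for i, color in enumerate(reversed(stack)):
            out[rows - 1 - i][c] = color
    return out

board = [
    ["R", ".", "."],
    ["G", "G", "G"],  # this row clears
    ["B", "R", "B"],
]
for row in clear_and_drop(board):
    print("".join(row))
```

A full game would run this step repeatedly until no runs remain (clears can cascade), and a Tetris-style full-line rule could be added as a second clearing pass.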
Introducing xAI's AI Gaming Studio
Yes, so we're launching an AI gaming studio at xAI. If you're interested in joining us and building AI games, please join xAI. We're launching an AI gaming studio; we're announcing it tonight. Let's go, Epic Games. But wait, that's an actual game studio. Yeah, yeah. All right, so I think one thing that's super exciting for us is that once you have the best pre-trained model, you have the best reasoning model. We already see that when you give those models the capability to think harder, think longer, think more broadly, the performance continues to improve. And we're really excited about the next frontier: what happens if we not only allow the model to think harder, but also provide more tools, just like real humans use to solve problems? For real humans, we don't ask them to solve the Riemann hypothesis with just pen and paper and no internet. So all the basics, web browsing, search engines, and code interpreters, build the foundation, and the best reasoning model builds the foundation for the Grok agents to come. So today we're actually
Unveiling DeepSearch: The Next-Gen Search Engine
introducing a new product called DeepSearch. It's the first generation of our Grok agents, one that doesn't just help engineers, researchers, and scientists to do coding, but actually helps everyone answer the questions they have day to day. It's kind of like a next-generation search engine that really helps you understand the universe. So you can start asking questions like, for example: hey, when is the next Starship launch date? So let's try that and see if we get the answer. On the left-hand side we see a high-level progress bar. The model isn't just going to do one single search like current RAG systems; it actually thinks very deeply about, hey, what's the user's intent here, what facts should I consider, and, at the same time, how many different websites should I actually go and read? This can save hundreds of hours of everyone's Google time if you really want to look into certain topics. And on the right-hand side you can see a bulleted view of what the model is doing: what websites it's browsing, what sources it's verifying. Often it actually cross-validates different sources out there to make sure the answer is correct before it puts out the final answer. And we can, at the same time, fire up a few more queries. How about... you're a gamer, right? Sure, yeah. How about: what are some of the best and most popular builds in Path of Exile 2 hardcore? Right, hardcore league. I mean, you could technically just look at the hardcore ladder, which might be a faster way to figure it out. Yeah, we'll see what the model does. And then we can also do something more fun, for example: how about a query on the March Madness out there? Yeah, so this is kind of a fun one, where Warren Buffett has a billion-dollar bet: if you can exactly match, I think, the entire winning bracket of March Madness, you can win a billion dollars from Warren Buffett. So it would be pretty cool if AI could help you win a billion dollars from Buffett; that seems like a pretty good investment. Let's go. Yeah, all right, so now let's fire up the query and see what the model does. And we can actually go back to our very first one. How about that, it wasn't counting on this. That's right. Okay, so we
got the first one, and the model thought for around one minute. Okay, so the key insight here: the next Starship launch is going to be on February 24th or later, so no earlier than February 24th. It might be sooner. So yeah, I think we can scroll down to see what the model did. It did a little research: Flight 7, what happened, that it got grounded, and it actually looked into the FCC filings from its data collection, and from that it could draw the new conclusion. Yeah, if we continue to scroll down, let's see... right, yeah, so it makes a little table. I think inside xAI we often joke that the time to the first table is the only latency metric that matters. Yeah, so that's how the model makes inferences and looks up all the sources. Then we can look into the gaming one. So for this particular one, it found the top build is lightning, with the Infernalist. But if we go down, there's a surprising fact about all the other builds: it looked into the twelve classes, and we see that the minion build was pretty popular when the game first came out, and now the Invoker Monk has taken over. Invoker Monk, for sure, yeah, that's right. Followed by the Stormweaver, which is really good at mapping. So yeah. And then we can see the March Madness one, how about that? So one interesting thing about DeepSearch is that if you go into the panel where it shows the subtasks, you can click at the bottom left, and you can actually scroll through, actually reading through the mind of Grok: what information does the model consider trustworthy, what not, and how does it cross-validate different information sources? That makes the entire search experience and information-retrieval process a lot more transparent to all users. This is much
more powerful than any search engine out there. You can literally just tell it, only use sources from X, and it will try to respect that. So it's much more steerable, much more intelligent. I mean, it really should save you a lot of time: something that might take an hour of researching on the web or searching media, you can just ask it to go do, and it comes back 10 minutes later having done an hour's worth of work for you. That's really what it comes down to, exactly, and maybe better than you could have done it yourself. Yeah, think of an infinite number of interns working for you: now you can just fire up all the tasks and come back a minute later. This is going to be an interesting one.
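The DeepSearch flow described above (interpret intent, fan out sub-queries, read sources, cross-validate, then answer) is essentially an agent loop. A toy, purely illustrative sketch of that control flow, with hypothetical stubbed planner and retriever functions rather than any real xAI API:

```python
from collections import Counter

# Toy sketch of a DeepSearch-style agent loop: plan sub-queries, gather claims
# from each, cross-validate, and answer only with corroborated claims.
# Every function here is a hypothetical stub, not xAI's implementation.

def plan_queries(question: str) -> list[str]:
    # Stub planner: a real agent derives these from the user's intent.
    return [question, question + " latest news", question + " official source"]

def search_and_read(source_id: int) -> list[str]:
    # Stub retriever: claims extracted from the pages found for one sub-query.
    corpus = {
        0: ["launch on Feb 24", "launch on Feb 24"],
        1: ["launch on Feb 24", "launch delayed to March"],
        2: ["launch on Feb 24"],
    }
    return corpus[source_id]

def deep_search(question: str, min_sources: int = 2) -> str:
    claims = Counter()
    for i, _query in enumerate(plan_queries(question)):
        for claim in search_and_read(i):
            claims[claim] += 1
    # Cross-validation step: only answer with a claim seen in enough places.
    claim, count = claims.most_common(1)[0]
    return claim if count >= min_sources else "insufficient corroboration"

print(deep_search("next Starship launch date"))  # the corroborated claim wins
```

The real product replaces each stub with a reasoning model making live web calls, but the corroboration idea, preferring claims confirmed across independent sources, is the part that makes the answers trustworthy.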
March Madness hadn't happened yet, so I guess we'll have to follow up in the next live stream. Yeah, it seems like a pretty good deal: a $40 subscription might get you a billion dollars. That's right. I mean, it might work. So yeah, when are users going to get their hands on Grok 3? Yeah, so the good news is we've been working tirelessly to actually release all of
Grok 3: Features, Launch, and Future Plans
these features that we've shown you: the Grok 3 base model with amazing chat capabilities, which is really useful and really interesting to talk to, the DeepSearch, the advanced reasoning mode. All of these things we want to roll out to you today, starting with the Premium+ subscribers on X. That's the first group that will initially get access. Make sure to update your X app if you want to see all of the advanced capabilities, because we just released the update now, as we're talking here. And yeah, if you're interested in getting early access to Grok, then sign up for Premium+. We're also announcing that we're starting a separate subscription for Grok, which we call SuperGrok, for those Grok fans who want the most advanced capabilities and the earliest access to new features, so feel free to check that out as well. This is for the dedicated Grok app and for the website. Exactly, so our new website is called grok.com. Yeah, you'd never guess. And you can also find our Grok app in the iOS App Store, and that gives you an even more polished experience that's totally Grok-focused, if you want Grok easily available, one tap away. Yeah, the version on grok.com in a web browser is going to be the latest and most advanced version, because obviously it takes us a while to get something into an app and then get it approved by the App Store, and if something's in a phone format there are limitations on what you can do. So the most powerful version of Grok, and the latest version, will be the web version at grok.com. Yeah, so watch out for the name Grok 3 in the app; that's the giveaway. Yeah, exactly, that's the giveaway that you have Grok 3, and if it says Grok 2, then Grok 3 hasn't quite arrived for you yet. But we're working hard to roll this out today, and then to even more people over the coming days. Yeah, make sure you update your phone app too, where you're going to get all the tools we showcased today, the thinking mode, the DeepSearch. So yeah, we're really looking forward to all the feedback you have. Yeah, and I think we should
emphasize that this is kind of a beta, meaning you should expect some imperfections at first,
but we will improve it rapidly; in fact, every day I think it'll get better. So if you want a
more polished version, maybe wait a week, but expect improvements literally every day. And then
we're also going to be providing a voice, so you can have conversational interaction. In fact,
I was trying it earlier today and it's working pretty well, but it needs a bit more polish,
the way you can just literally talk to it like you're talking to a person. That's awesome.
It's actually, I think, one of the best experiences of Grok, but that's probably about a week
away. So with that said, I think we might have some audience questions. Sure, yeah, all right,
let's take a look at the questions from the audience on the X platform. Cool, so the
first question here is: Grok Voice Assistant, when is it coming out? As soon as possible, just
like Elon said, it's just a little bit of polish away from being available to everybody.
Obviously it's going to be released in an early form, and we're going to rapidly iterate on
that. And the next question is: when will Grok 3 be in the API? The Grok 3 API, with both the
reasoning models and Deep Search, is coming your way in the coming weeks. We're actually very
excited about the enterprise use cases of all these additional tools that Grok now has access
to, and how the test-time compute and tool use can really accelerate business use cases.
Another one is: will voice mode be native, or text-to-speech? I think that means: is it going
to be one model that understands what you say and then talks back to you, or is it going to be
some system that has text-to-speech inside of it? And the good news is it's going to be one
model, a variant of Grok 3 that we're going to release, which understands what you're saying
and then generates the audio directly from that. So very much like Grok 3 generates text, that
model generates audio, and that has a bunch of advantages. I was talking to it earlier today
and it mispronounced my name, probably reading it from some text that it had, and I said no,
no, my name is Igor, and it remembered that, so it could continue to say Igor just like a human
would, and you can't achieve that with text-to-speech.
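The difference between the two designs can be sketched in a few lines of Python. This is an illustrative sketch, not xAI's actual API; the function names and stand-in callables are hypothetical. The point is that a cascaded pipeline discards everything but text at each stage boundary, while a native model keeps the whole conversation, audio included, in one context.

```python
# Hypothetical sketch of the two voice-mode architectures.

def cascaded(audio_in, asr, llm, tts):
    """ASR -> text LLM -> TTS: tone, pacing, and pronunciation cues are
    discarded at the text boundary between each stage."""
    text_in = asr(audio_in)    # audio -> text (prosody lost here)
    text_out = llm(text_in)    # text -> text
    return tts(text_out)       # text -> audio, with no audio context

def native(audio_in, speech_model):
    """One end-to-end model: audio in, audio out, so a spoken correction
    like "my name is Igor" stays in the model's context."""
    return speech_model(audio_in)

# Toy stand-ins just to show the data flow:
print(cascaded("hi", str.upper, lambda t: t + "!", str.lower))  # hi!
```

With toy callables standing in for real models, the cascaded path visibly launders everything through text, which is exactly why a pronunciation correction cannot survive it.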
Oh, here's a question for you, pretty spicy: is Grok a boy or a girl? Grok is whatever you
want it to be. Are you single? Yes. All right, shop is open. So honestly, people are going to
fall in love with Grok; it's like 1,000% probable. The next question: will Grok be able to
transcribe audio into text? Yes, we'll have this capability in both the app and the API. We
think Grok should just be your personal assistant, looking over your shoulder, following you
along the way, learning everything you have learned, and really helping you understand the
world better and become smarter every day. I mean, the voice mode isn't simply voice-to-text;
it understands tone, inflection, pacing, everything. It's wild; it's like talking to a person.
Any plans for conversation memory? Absolutely, we're working on it right now. Let's see what
the other ones are. What about the DM features, where you have personalization and Grok
remembers your previous interactions? Yes. Should it be one Grok or multiple different Groks?
It's up to you: you can have one Grok or many Groks. I suspect people will probably have more
than one. I want to have a Dr. Grok. Yeah, the Grok dog. That's right. All right.
Well, in the past we've open-sourced Grok 1, right, so somebody's asking us: are we going to
do that again with Grok 2? Yeah, our general approach is that we will open-source the last
version when the next version is fully out. So when Grok 3 is mature and stable, which is
probably within a few months, then we'll open-source Grok 2. Okay, so we probably have time
for one last question: what was the most difficult part about working on this project, I
assume Grok 3, and what are you most excited about? I think, looking back, getting the whole
model training coherently on the 100,000 H100s is almost like battling the final boss of the
universe: entropy. At any given time a cosmic ray can beam down and flip a bit in your
transistors, and if it flips a mantissa bit, the entire gradient update is out of whack. And
now you have a hundred thousand of those to orchestrate, and at any given time any of the
GPUs can go down.
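The mantissa-bit worry is easy to demonstrate. In IEEE-754 float32, the top mantissa bit alone is worth half the value's magnitude, and an exponent bit is far worse. A small sketch (illustrative only, not xAI's training code):

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    """Return x (as float32) with one bit of its IEEE-754 encoding flipped.
    Bits 0-22 are the mantissa, 23-30 the exponent, 31 the sign."""
    (i,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", i ^ (1 << bit)))
    return y

g = 1.0  # a stand-in gradient value
print(flip_bit(g, 22))  # 1.5 -- the top mantissa bit shifts the value by 50%
print(flip_bit(g, 30))  # inf -- one exponent bit and the gradient explodes
```

A single flipped bit like this, propagated through an all-reduce across a hundred thousand GPUs, corrupts the entire synchronized gradient update, which is why this failure mode matters at cluster scale.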
Yeah. I mean, it's worth breaking down how we were able to get the world's most powerful
training cluster operational within 122 days. We actually weren't intending to do a data
center ourselves. We went to the data center providers and said, how long would it take to
have 100,000 GPUs operating coherently in a single location? And we got time frames of 18 to
24 months. So we're like, well, 18 to 24 months means losing is a certainty, so the only
option was to do it ourselves. Then you break down the problem. I guess you're doing
reasoning here, like one single chain of thought. Yeah,
exactly. So, well, we needed a building. We can't build a building in that time, so we had to
use an existing building. So we looked for factories that had been abandoned but were in good
shape, like a company had gone bankrupt or something, and we found an Electrolux factory in
Memphis; that's why it's in Memphis, home of Elvis, and also one of the oldest names, I think
Memphis was the capital of ancient Egypt. It was actually a very nice factory that, for
whatever reason, Electrolux had left, and that gave us shelter for the computers. Then we
needed power: we needed at least 120 megawatts at first, but the building only had 15
megawatts, and ultimately, for 200,000 GPUs, we needed a quarter gigawatt.
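The quarter-gigawatt figure checks out as a back-of-the-envelope calculation, assuming roughly 1.25 kW per GPU all-in (server, networking, and cooling overhead; that per-GPU number is my assumption, not a stated xAI figure):

```python
# Back-of-the-envelope cluster power budget. The per-GPU figure is an
# assumed all-in number (server + networking + cooling), not an official one.
gpu_count = 200_000
kw_per_gpu = 1.25          # assumed kW per GPU, all-in

total_mw = gpu_count * kw_per_gpu / 1_000
print(total_mw)            # 250.0 MW, i.e. a quarter gigawatt
```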
So we initially leased a whole bunch of generators: on one side of the building, trailer
after trailer of generators, until we could get the utility power to come in. But then we
also needed cooling, so on the other side of the building it was just trailer after trailer
of cooling; we leased about a quarter of the mobile cooling capacity of the United States.
Then we needed to get the GPUs all installed, and they're all liquid-cooled: in order to
achieve the density necessary, this is a liquid-cooled system, so we had to get all the
plumbing for liquid cooling. Nobody had ever done a liquid-cooled data center at scale, so
this was an incredibly dedicated effort by a very talented team to achieve that outcome.
Many thought it wasn't going to work. Nope. The issue is that the power fluctuations for a
GPU cluster are
dramatic. It's like a giant symphony taking place: imagine having a symphony with 100,000 or
200,000 participants, and the whole orchestra goes quiet and loud in, you know, 100
milliseconds. This caused massive power fluctuations, which then caused the generators to
lose their minds; they weren't expecting this. So to buffer the power, we used Tesla
Megapacks to smooth out the power. The Megapacks had to be reprogrammed, so xAI, working
with Tesla, reprogrammed the Megapacks to be able to deal with these dramatic power
fluctuations and smooth out the power so the computers could actually run properly. And
that worked quite nicely.
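The buffering idea can be shown with a toy simulation; the numbers and the trailing-average policy are illustrative, not Megapack firmware. The battery serves the difference between the cluster's instantaneous draw and a slow rolling average, so the generators see a nearly constant load:

```python
def buffered_power(load_mw, window=4):
    """For each step, generators supply the trailing average of demand and
    the battery covers the residual (+ = discharging, - = charging)."""
    out = []
    for i in range(len(load_mw)):
        recent = load_mw[max(0, i - window + 1): i + 1]
        gen = sum(recent) / len(recent)
        out.append((gen, load_mw[i] - gen))
    return out

# Cluster draw swinging between 50 MW and 250 MW every ~100 ms
load = [50, 250] * 4
for gen, battery in buffered_power(load):
    print(f"generators {gen:6.1f} MW, battery {battery:+6.1f} MW")
```

Once the window fills, the generators hold a steady 150 MW while the battery swings plus or minus 100 MW, which is the shape of the problem the Megapacks were reprogrammed to handle.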
But even at that point, you still have to make the computers all communicate effectively, so
all the networking had to be solved: debugging a bazillion network cables, debugging NCCL at
4:00 in the morning; we solved it at roughly 4:20 a.m. There were a whole bunch of issues.
One was a BIOS mismatch: the BIOS was not set up correctly. Yeah, we diffed the lspci
outputs between two different machines, one that was working and one that was not working.
And many, many other things, exactly.
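That debugging step, diffing `lspci` output between a healthy node and a misbehaving one, looks roughly like this; the device lines below are made up for illustration, not actual output from their machines:

```python
import difflib

# Made-up lspci excerpts from two nodes; on real machines you would capture
# these with `lspci` and diff them to spot missing or downgraded devices.
healthy = [
    "01:00.0 3D controller: NVIDIA Corporation GPU (rev a1)",
    "02:00.0 Ethernet controller: Mellanox ConnectX NIC",
]
broken = [
    "01:00.0 3D controller: NVIDIA Corporation GPU (rev a1)",
    # the NIC has dropped off the bus on the broken node
]

for line in difflib.unified_diff(healthy, broken, "healthy", "broken", lineterm=""):
    print(line)
```

The diff immediately surfaces the device that is present on the working node and missing on the broken one, which is the kind of needle-in-a-haystack comparison that matters at 4 a.m.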
This would go on for a long time if we actually listed all the things. But you know, it's
interesting: it's not like we just magically made it happen. You have to break down the
problem, just like Grok does for reasoning, into its constituent elements, and then solve
each of the constituent elements in order to achieve a coherent training cluster in a period
of time that is a small fraction of what anyone else could do
it in. And then, once the training cluster was up and running and we could use it, we had to
make sure it actually stays healthy throughout, which is its own giant challenge. And then
we had to get every single detail of the training right in order to get a Grok-level model,
which is actually really, really hard. We don't know if there are any other models out there
that have Grok's capabilities, but whoever trains a model better than Grok has to be
extremely good at the science of deep learning and at every aspect of the engineering, so
it's not so easy to pull this off. And is this now going to be the last cluster we build and
the last model we train? Oh yeah, we've already started work on the next cluster, which
will be about five times the power, so instead of a quarter gigawatt, roughly 1.2 gigawatts.
What's the Back to the Future number, the power of the Back to the Future car? Anyway, it's
Back to the Future power, roughly in that order I think. These will be sort of the
GB200/GB300 clusters, and it will once again be the most powerful training cluster in the
world. So we're not stopping here, and our reasoning model is going to continue to improve
by accessing more tools every day, so we're very excited to share all of the upcoming
results with you. Yeah, the thing that keeps us going is basically being able to give Grok 3
to you, and then seeing the usage go up and seeing everybody enjoy it.