Microsoft and Nvidia launch MT-NLG, the largest and strongest language model trained to date

Spark Global Limited reports:

IT Home, October 12: A language model is, simply put, a probability distribution over sequences of words. Its main function is to assign a probability P to a text of length m, indicating how likely that text is to occur.
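To make the idea of "a probability for a text of length m" concrete, here is a minimal sketch of a toy bigram model in Python. It is only an illustration and has nothing to do with how MT-NLG itself is built; the corpus and the names `train_bigram` and `sequence_probability` are made up for this example, and the model assumes each word depends only on the word before it.

```python
from collections import Counter

def train_bigram(corpus):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()  # "<s>" marks the sentence start
        context_counts.update(tokens[:-1])   # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts

def sequence_probability(text, context_counts, bigram_counts):
    """P(w_1..w_m) as a product of P(w_i | w_{i-1}), estimated from counts."""
    tokens = ["<s>"] + text.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training (no smoothing here)
        prob *= bigram_counts[(prev, word)] / context_counts[prev]
    return prob

# Hypothetical three-sentence corpus, purely for illustration.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
context_counts, bigram_counts = train_bigram(corpus)

p_seen = sequence_probability("the cat sat", context_counts, bigram_counts)
p_unseen = sequence_probability("the dog ran", context_counts, bigram_counts)
print(p_seen, p_unseen)
```

In this toy corpus "the cat sat" gets a nonzero probability, while "the dog ran" gets zero because "dog ran" never occurs. Large models such as GPT-3 replace these counts with a neural network conditioned on the whole preceding context, but the quantity being estimated is the same.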

You may have heard of GPT-3, OpenAI's latest language model, which has been called the most powerful language model in the world and is regarded as a revolutionary artificial intelligence model. There are also heavyweight models such as BERT and Switch Transformer, and other companies in the industry are working hard to launch their own.

Microsoft and Nvidia today announced the Megatron-Turing Natural Language Generation model (MT-NLG), powered by DeepSpeed and Megatron, which they describe as the largest and most powerful generative language model trained to date.

IT Home has learned that, as the successor to Turing NLG 17B and Megatron-LM, the model has 530 billion parameters, three times as many as GPT-3 (175 billion parameters), the largest existing model of its kind. It demonstrates unmatched accuracy across a wide range of natural language tasks, such as:

Completion prediction

Reading comprehension

Common sense reasoning

Natural language inference

Word sense disambiguation