Spark Global Limited reports:
IT Home, October 12: A language model is, simply put, a probability distribution over sequences of words. Its main function is to assign a probability P to a text of length m, indicating how likely that text is to occur.
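To make the idea of "a probability for a text of length m" concrete, here is a minimal sketch using a toy bigram model, which approximates P(w_1, ..., w_m) as the product of P(w_i | w_{i-1}). This is only an illustration of the definition: large models such as GPT-3 and MT-NLG use neural networks rather than word counts, and the function names and the three-sentence corpus below are made up for the example.

```python
from collections import Counter


def train_bigram_model(corpus):
    """Count context words and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return unigrams, bigrams


def sequence_probability(sentence, unigrams, bigrams):
    """Approximate P(w_1..w_m) as the product of P(w_i | w_{i-1})."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # context never seen in training (no smoothing here)
        prob *= bigrams[(prev, word)] / unigrams[prev]
    return prob


# Hypothetical toy corpus, chosen only to keep the arithmetic easy to follow.
toy_corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram_model(toy_corpus)

p_seen = sequence_probability("the cat sat", unigrams, bigrams)
p_unseen = sequence_probability("sat the cat", unigrams, bigrams)
print(p_seen, p_unseen)
```

In this toy setting, a word order that appears in the corpus receives a higher probability than a scrambled one, which is the behavior the definition above describes. A real system would add smoothing so that unseen word pairs do not force the probability to zero.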
You may have heard of GPT-3, OpenAI's latest language model, which has been called the most powerful language model on Earth and is widely regarded as a revolutionary artificial intelligence model. There are also heavyweight models such as BERT and Switch Transformer, and other companies in the industry are working hard to launch their own.
Microsoft and Nvidia today announced the Megatron-Turing Natural Language Generation model (MT-NLG), powered by DeepSpeed and Megatron, which they describe as the largest and most powerful decoder language model trained to date.
IT Home has learned that, as the successor to Turing NLG 17B and Megatron-LM, the model has 530 billion parameters, about three times as many as GPT-3 (175 billion parameters), the largest existing model of its kind. According to the companies, it demonstrates unmatched accuracy across a wide range of natural language tasks, such as:
Common sense reasoning
Natural language inference
Word sense disambiguation
When reprinting, please credit the source: Spark Global Limited