Exploring machine learning with GPT

At times, research and development within a field makes a leap that completely changes the landscape of that research area. Within AI research and machine learning, Generative Pre-trained Transformers such as GPT-3 could probably be considered such a leap.

These machine learning systems are language prediction models that generate text so naturally that it is often difficult to distinguish their writing from that of an ordinary human being. Give the system a topic and describe what you want written, and it will generate a piece of text or even a complete article.

Just a few years ago this would have been considered impossible.

GPT-3 specifically was developed by OpenAI, a research laboratory in San Francisco. It was trained on a huge body of text gathered from the Internet, although that training data is a fixed snapshot rather than something the model continuously updates.

Asking GPT-3 to write about Unigrid

To test how well the model works and how up to date it actually is, we asked it to write a short article about Unigrid and decentralization. We gave it the following input:

Write about Unigrid and how Unigrid and decentralized systems are the next natural step in the evolution of the Internet. Cover how Unigrid is a load balanced network that is completely anonymous and resistant to eavesdropping. The segmented design features sharding, data striping and parity blocks — creating a network that protects against service disruption and data loss. Also touch on how Unigrid can help businesses to reduce costs by eliminating expensive hardware maintenance, while at the same time providing a high level of security for their customers information.

GPT-3 processed this input and generated the article in the next section.

Unigrid & decentralization (author: GPT-3)

The central backbone nodes of Unigrid will be linked together in an overlay mesh network. This means that all nodes will have access to each other’s services. The meshes are partitioned into subnets. These subnets, or shard groups, are created from large numbers of smaller networks or segments. Each node on these subnets will be assigned its own unique identifier within this larger system. This segmentation will increase the speed and resilience of the network by minimizing latency across any given point, as well as reducing the risk of packet collisions.

Each segment will also contain multiple redundant servers, which can be used to provide additional capacity should one server fail. This redundancy allows us to create fault tolerant servers without having to rely on external components such as RAID arrays. All communications between nodes will take place via encrypted tunnels, using either public key encryption or symmetric algorithms. There will not be any need for direct connections between different segments, since there will always be a path through another node.

The Unigrid Foundation wants to promote the evolution of blockchain technology and reimagine traditional cloud services by providing secure, private and resilient solutions. We believe that our approach has significant advantages over existing models, especially when it comes to scalability and cost effectiveness.

A network such as Unigrid can massively increase the efficiency of neural networks and A.I by providing massive amounts of processing power, while simultaneously protecting user privacy. The increased availability of resources allows for more accurate modeling of behavior, allowing researchers to build better predictive tools and smarter robots. It could even improve online gaming performance by enabling users to share resources across machines, thereby increasing overall computing capabilities.

Blockchain technology provides a robust infrastructure upon which we can build new applications. By utilizing the underlying principles behind Bitcoin, we can develop technologies that operate outside of centralized control structures and run autonomously. For example, imagine if your home computer was able to automatically update itself whenever you were connected to a wireless hotspot. You wouldn’t have to worry about viruses infecting your machine because the software would constantly monitor itself for updates.

As the world becomes increasingly interconnected, many people will want to use the Internet in ways that are currently impossible due to the lack of reliable and affordable connectivity. We hope to bring the benefits of decentralization to everyone, regardless of where they live or what type of device they’re using.

Unigrid offers several major benefits;

First, it eliminates the need for central management. Instead of relying on an organization like Google or Facebook to maintain their servers, Unigrid lets companies manage and upgrade their own hardware directly. In this way, businesses are no longer dependent on third party providers. Second, it creates a platform free from censorship. Because all data is stored locally on nodes, governments cannot easily shut down websites or block content. Third, it reduces costs by eliminating expensive hardware maintenance.

Evaluating the results

The result is impressive, to say the least. It seems that the system has not only scraped our website, but also has the white paper and orange paper as references.

The ability of this predictive model to generate texts and draw complex deductions is extremely impressive.

The model seems to have a lot of underlying data, enough to talk about the topics in depth. It is also able to draw its own conclusions and ideas in unpredictable ways: run the input again and you get something completely different. This means that you can use this prediction model not only to generate text, but potentially also to gather new ideas.
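Why does the same input give a different article each time? Language models of this kind typically pick each next word by sampling from a probability distribution instead of always taking the single most likely word. The snippet below is only a toy sketch of that idea, not GPT-3's actual code. The function name sample_next_word, the example words and their scores, and the temperature values are all made up for illustration.

```python
import math
import random


def sample_next_word(scores, temperature=1.0, rng=random):
    """Pick one word from a {word: score} mapping using softmax sampling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    words = list(scores)
    # Divide by the temperature: low values sharpen the distribution,
    # high values flatten it and make unlikely words more common.
    scaled = [scores[w] / temperature for w in words]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    # choices() draws proportionally to the (unnormalised) weights.
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical scores for the word that follows "Unigrid is a ..."
toy_scores = {"network": 2.0, "platform": 1.5, "protocol": 1.0, "blockchain": 0.5}

if __name__ == "__main__":
    # Each run may print a different sequence of words.
    for run in range(5):
        print(run, sample_next_word(toy_scores, temperature=0.8))
```

With a temperature close to zero the highest-scoring word wins almost every time, while higher temperatures give more varied (and more surprising) output. That trade-off is one plausible reason why rerunning the same input can yield a completely different article.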

I don’t know about you, but a decade or two ago I was personally not expecting us to have reached this point already.

The future of pre-trained transformers

Despite the name of the company behind it (OpenAI), a big problem with GPT-3 is that the underlying model is not open. In 2020, Microsoft announced that it had obtained an exclusive license to GPT-3 for its products and services, following a billion-dollar investment in OpenAI. This essentially means that while everybody can use the API to query (and even teach) the model, outside of OpenAI only Microsoft has access to the underlying model and source code.

The fact that the underlying source code is hidden away makes the future of GPT-3 highly uncertain, and open solutions will most likely take over its current dominance. Truly open alternatives, where the source code can be accessed by anyone, are already being developed. A good example of such a model is GPT-J-6B, which can be downloaded from GitHub. The model can also be tested at EleutherAI.

We tested GPT-J as well, and while the results were impressive, its model does not seem to be quite as well trained or up to date.

With that said, an open model such as GPT-J could be deployed on Unigrid, allowing it to store and process an unprecedented amount of data as the network grows. Together with some advanced scraping, a truly open predictive model could be hosted on the network and made available to everybody.

Who knows where the future will take us, but the potential of what could be achieved when decentralized systems such as Unigrid mix with these projects is extremely exciting.
