TAPAGI: Towards A Proto-AGI

20 min read · Dec 11, 2022

TAPAGI, or Towards A Proto-AGI, is a new research project that aims to explore the concept of artificial general intelligence (AGI) and develop a proto-AGI system. The goal of the project is to build a system that is capable of exhibiting intelligent behavior and solving complex problems, similar to the human brain.

(Note: this was mostly written by AI, with guidance from me on the structure, systems, and idea; the AI filled in the blanks and expanded it out. This is a first step.)

To achieve this goal, the project is using a multi-disciplinary approach that combines techniques and methods from fields such as computer science, neuroscience, psychology, and philosophy. The project is also using a modular and hierarchical architecture, in which the system is made up of multiple specialized subsystems, each of which performs a specific task or function.

The project is committed to open-source principles and practices, and all of the code, data, and other resources developed as part of the project are freely available to the public.

There are several reasons why the TAPAGI project is open-source. First, open-source principles and practices promote collaboration, transparency, and accountability, which are essential for advancing the field of AGI and ensuring its responsible development. By making the project’s resources and materials openly available, the TAPAGI project encourages collaboration and engagement with other researchers, and allows the broader community to contribute to, review, and validate the project’s work.

Second, open-source practices promote innovation and creativity, as they enable researchers to build on and extend each other’s work, and to combine and integrate different approaches and methods. By making the project’s resources and materials openly available, the TAPAGI project enables researchers to reuse and adapt the project’s work, and to develop new and creative solutions to the challenges of AGI.

Third, open-source practices promote accessibility and inclusivity, as they enable researchers from different backgrounds and disciplines to access and use the project’s resources and materials. By making the project’s resources and materials openly available, the TAPAGI project allows researchers from diverse fields and backgrounds to participate in the project, and to contribute to the project’s work.

In short, the TAPAGI project is committed to open-source principles and practices because we believe they are essential for advancing the field of AGI.

Five Systems for a Proto-AGI

In recent years, there has been increasing interest in the development of artificial general intelligence (AGI), which is a type of artificial intelligence that is capable of exhibiting intelligent behavior and solving complex problems, similar to the human brain. One approach to achieving AGI is to build a system that is made up of multiple specialized subsystems, each of which performs a specific task or function.

This approach, which we will refer to as a proto-AGI, consists of five main systems: the input bus, the scene construction and simulation system, the output generation and simulation system, the judgment system, and the output bus. Each of these systems has a specific role and function, and together they enable the proto-AGI to collect information, generate responses, make decisions, and interact with its environment.

In this article, we will introduce these five systems and explain their functions and roles in more detail. We will also describe how these systems work together to enable a proto-AGI to function, and discuss the challenges and limitations of these systems. By understanding these systems and their capabilities, we can gain insight into the nature of AGI, and the challenges and opportunities that it presents.

The input bus is the system that processes and organizes the various types of input that the proto-AGI receives. This could include audio, video, text, data, and other types of input. The input bus is responsible for converting the raw input into a format that the other systems can understand and use.
The scene construction and simulation system is responsible for creating a virtual environment in which the proto-AGI can interact with its surroundings. This system uses the input from the input bus to create a realistic simulation of the environment, including objects, people, and other entities.
The output generation and simulation system is responsible for generating responses to the input that the proto-AGI receives. This system uses the scene construction and simulation to generate appropriate responses, such as answering questions, providing information, or taking action in the simulated environment.
The judgment system is responsible for evaluating the various responses generated by the output generation and simulation system, and selecting the best response based on the current situation. This system uses various criteria, such as the relevance of the response, the likelihood of success, and the potential consequences, to make its decisions.
The output bus is the system that converts the selected response into the appropriate format for output. This could include audio, text, movement, data, or other types of output. The output bus is responsible for delivering the response to the appropriate destination, such as a person, device, or system.

These systems work together to enable the proto-AGI to receive and process input, generate and evaluate responses, and produce appropriate output. The goal is to create a system that can exhibit intelligent behavior and interact with its environment in a meaningful way.
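The flow between the five systems can be sketched in a few lines of Python. This is a minimal illustration under heavy assumptions, not the project's implementation: every function name, the `Candidate` class, and the word-overlap scoring are hypothetical stand-ins chosen only to show how data might move from one system to the next.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A potential response plus the score assigned by the judgment step."""
    text: str
    score: float = 0.0


def input_bus(raw: str) -> dict:
    # Normalize raw input into a common format for the other systems.
    return {"modality": "text", "content": raw.strip().lower()}


def construct_scene(message: dict) -> dict:
    # Build a very simplified "scene": just the set of words mentioned.
    return {"entities": set(message["content"].split())}


def generate_outputs(scene: dict) -> List[Candidate]:
    # Propose candidate responses for the current scene.
    candidates = [Candidate("I do not understand.")]
    if "lights" in scene["entities"]:
        candidates.append(Candidate("Turning on the lights."))
    return candidates


def judge(candidates: List[Candidate], scene: dict) -> Candidate:
    # Score each candidate by word overlap with the scene and keep the best.
    for candidate in candidates:
        words = set(candidate.text.lower().rstrip(".").split())
        candidate.score = len(words & scene["entities"])
    return max(candidates, key=lambda c: c.score)


def output_bus(selected: Candidate) -> str:
    # Convert the selected response into its delivery format (plain text here).
    return selected.text


def run_pipeline(raw: str) -> str:
    message = input_bus(raw)
    scene = construct_scene(message)
    candidates = generate_outputs(scene)
    best = judge(candidates, scene)
    return output_bus(best)
```

Calling `run_pipeline("Turn on the lights")` passes the text through all five stages in order. In a real system each stage would be far richer, but the hand-off between stages would likely keep a similar shape.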

Input Bus

The input bus is a component of a proto-AGI that is responsible for collecting and processing input from various sources. This input can include speech, text, images, sensors, and data, and can be used to construct a scene, generate responses, and make decisions. The input bus uses a variety of tools and techniques, such as natural language processing algorithms, computer vision algorithms, and sensor data analysis algorithms, to extract relevant information and understand the meaning of the input. The input bus is a critical component of the proto-AGI, as it provides the AGI with the information and knowledge it needs to function and achieve its goals.

Here is a list of possible input methods for a proto-AGI:

Speech: The proto-AGI could use speech recognition algorithms to process and analyze spoken input, such as conversations and commands. This could involve using microphones to capture the speech, and using natural language processing (NLP) algorithms to convert the speech into text, extract relevant information, and understand the meaning of the input.
Text: The proto-AGI could use natural language processing (NLP) algorithms to process and analyze text input, such as written documents and messages. This could involve using optical character recognition (OCR) algorithms to convert scanned or digital text into machine-readable form, and using NLP algorithms to extract relevant information and understand the meaning of the input.
Images: The proto-AGI could use computer vision algorithms to process and analyze visual input, such as images and videos. This could involve using cameras and other sensors to capture the images, and using computer vision algorithms to extract information about the objects, people, and scenes depicted in the input.
Sensors: The proto-AGI could use sensors to capture and analyze various types of physical input, such as temperature, humidity, pressure, and motion. This could involve using sensors to measure the physical properties of the environment, and using algorithms to extract relevant information and understand the meaning of the input.
Data: The proto-AGI could use data mining and machine learning algorithms to process and analyze data input, such as databases and datasets. This could involve using algorithms to extract relevant information and patterns from the data, and using this information to make predictions and decisions.
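One way to picture the input bus is as a registry of per-modality handlers that all emit the same message type. The sketch below assumes that design; the `Message` and `InputBus` names and the two toy handlers are hypothetical, and real handlers would wrap speech recognition, OCR, computer vision, or sensor drivers.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Message:
    """Common format that every input is converted into."""
    modality: str
    payload: Any
    timestamp: float = field(default_factory=time.time)


class InputBus:
    """Dispatches raw input to a handler registered for its modality."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], Any]] = {}

    def register(self, modality: str, handler: Callable[[Any], Any]) -> None:
        self._handlers[modality] = handler

    def receive(self, modality: str, raw: Any) -> Message:
        if modality not in self._handlers:
            raise ValueError(f"no handler registered for modality {modality!r}")
        return Message(modality, self._handlers[modality](raw))


# Toy handlers, for illustration only.
bus = InputBus()
bus.register("text", lambda raw: raw.strip().split())
bus.register("sensor", lambda raw: {"celsius": float(raw)})

text_message = bus.receive("text", "  hello world ")
sensor_message = bus.receive("sensor", "21.5")
```

Keeping the message format uniform means the downstream systems would not need to know which device or algorithm produced a given input.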

Scene Construction and Simulation

Scene construction and simulation is the process of using input from the input bus to create a virtual representation of a scene. This involves analyzing the input, such as speech, text, images, and sensors, to extract information about the objects, people, and events in the scene. This information is then used to construct a virtual representation of the scene, which can be used to generate responses, make decisions, and simulate the behavior of the entities in the scene. Scene construction and simulation is a critical component of a proto-AGI, as it provides the AGI with a representation of the world and the ability to interact with it in a realistic and meaningful way.

Here are some examples of specific tools that can be used to build up a scene using input from the input bus:

Computer vision algorithms: Computer vision algorithms can be used to analyze visual input, such as images and videos, and extract information about the objects, people, and scenes depicted in the input. This information can be used to build a virtual representation of the scene.
Natural language processing algorithms: Natural language processing (NLP) algorithms can be used to analyze language input, such as transcribed speech and written text, and extract information about the objects, people, and events mentioned in the input. This information can be used to build a virtual representation of the scene.
Scene graphs: Scene graphs are data structures that can be used to represent the objects, people, and events in a scene. These graphs typically consist of nodes and edges, where the nodes represent the entities in the scene, and the edges represent the relationships between the entities. Scene graphs can be generated from visual and text input, and can be used to build a virtual representation of the scene.
Virtual environments: Virtual environments, such as video games and simulation platforms, can be used to create a realistic and interactive representation of a scene. These environments typically consist of 3D models, textures, and other visual elements, as well as rules and algorithms that govern the behavior of the entities in the scene. Virtual environments can be populated with information extracted from the input bus, and can be used to build a virtual representation of the scene.
Object detection algorithms: Object detection algorithms are a type of computer vision algorithm that can be used to identify and locate objects in an image or video. These algorithms can analyze the visual input to detect the presence of specific objects, such as cars, buildings, or people, and can provide information about the location and orientation of the objects in the scene.
Part-of-speech tagging algorithms: Part-of-speech tagging algorithms are a type of natural language processing algorithm that can be used to identify the parts of speech (such as nouns, verbs, and adjectives) in a piece of text. These tags help identify candidate entities (typically nouns), the actions and relationships that connect them (verbs), and their characteristics (adjectives), which provides information about the roles of the entities in the scene.
Scene segmentation algorithms: Scene segmentation algorithms are a type of computer vision algorithm that can be used to divide an image or video into regions or segments, based on the similarity of the pixels or objects in each region. These algorithms can be used to segment a scene into different objects, people, and backgrounds, and can provide information about the composition and layout of the scene.
Interactive 3D modeling software: Interactive 3D modeling software, such as Blender and Maya, can be used to create and manipulate 3D models of objects, people, and scenes. These tools typically provide a user-friendly interface for designing and editing the models, and can be used to create a virtual representation of the scene based on the input from the input bus.
Sentiment Analysis: Sentiment analysis is the process of using natural language processing algorithms to identify the sentiment of a piece of text. This involves analyzing the words and phrases used in the text, as well as the context in which they are used, to determine whether the overall sentiment of the text is positive or negative. Sentiment analysis can be used in a variety of applications, including scene construction and generation of responses.
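Of the tools listed above, the scene graph is the easiest to show in code. The following is a minimal sketch, assuming a plain dictionary-based representation; the `SceneGraph` class and its method names are hypothetical and are not taken from the TAPAGI repository.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SceneGraph:
    """Nodes are entities; edges are labeled relationships between them."""

    def __init__(self) -> None:
        self.nodes: Dict[str, dict] = {}
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_entity(self, name: str, **attributes) -> None:
        # Store the entity together with any descriptive attributes.
        self.nodes[name] = attributes

    def relate(self, subject: str, relation: str, obj: str) -> None:
        # Add a directed edge such as ("cup", "on", "table").
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError("both entities must be added before relating them")
        self.edges[subject].append((relation, obj))

    def relations_of(self, subject: str) -> List[Tuple[str, str]]:
        # Return a copy so callers cannot mutate the graph by accident.
        return list(self.edges.get(subject, []))


scene = SceneGraph()
scene.add_entity("cup", color="red")
scene.add_entity("table", material="wood")
scene.relate("cup", "on", "table")
```

Entities detected by computer vision or mentioned in text could both be added through `add_entity`, which is one reason a shared graph structure is attractive for combining input modalities.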

Sentiment Analysis

To perform sentiment analysis using input from the input bus, you can use a natural language processing (NLP) model that has been trained to identify the sentiment of text. For example, you could use a pre-trained transformer model, such as a BERT model that has been fine-tuned for sentiment classification, to classify the text input as positive or negative. This could involve the following steps:

Tokenize the input: The first step is to tokenize the input text, which involves dividing the text into individual words or subword units that the model can analyze. BERT, for example, uses WordPiece subword tokens. This is done using the tokenizer that accompanies the model.
Encode the tokens: Once the input text has been tokenized, the next step is to encode the tokens as input to the model. This involves mapping each token to an integer ID from the model's vocabulary; the model then converts these IDs into numerical vectors (embeddings) that it can process.
Classify the sentiment: The final step is to use the model to classify the sentiment of the input text. This involves feeding the encoded tokens into the model, and receiving a classification of the sentiment as positive or negative. The model typically outputs a probability for each class, where a higher probability indicates greater confidence in that label, not necessarily a stronger sentiment.

Sentiment analysis involves several steps and tools, including tokenization, encoding, and classification. By using a pre-trained model such as BERT, fine-tuned for sentiment classification, you can perform sentiment analysis on input from the input bus to determine the sentiment of the text.
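The three steps can be illustrated without any external libraries. The sketch below is a toy stand-in, not BERT: it uses a tiny hand-picked vocabulary and made-up weights (both assumptions for illustration) with a logistic function, so that the tokenize, encode, and classify stages are each visible.

```python
import math
import re
from typing import Dict, List, Tuple

# Hypothetical vocabulary and weights, chosen by hand for illustration only.
VOCAB: Dict[str, int] = {"<unk>": 0, "good": 1, "great": 2, "bad": 3, "terrible": 4}
WEIGHTS: List[float] = [0.0, 1.5, 2.0, -1.5, -2.0]


def tokenize(text: str) -> List[str]:
    # Step 1: split lowercase text into word tokens.
    return re.findall(r"[a-z']+", text.lower())


def encode(tokens: List[str]) -> List[int]:
    # Step 2: map each token to an integer ID; unknown words share one ID.
    return [VOCAB.get(token, VOCAB["<unk>"]) for token in tokens]


def classify(token_ids: List[int]) -> Tuple[str, float]:
    # Step 3: sum the per-token weights and squash to a probability.
    logit = sum(WEIGHTS[i] for i in token_ids)
    p_positive = 1.0 / (1.0 + math.exp(-logit))
    label = "positive" if p_positive >= 0.5 else "negative"
    return label, p_positive


label, probability = classify(encode(tokenize("This movie was great!")))
```

A fine-tuned transformer would replace the vocabulary lookup and the weight sum with learned subword embeddings and many layers of computation, but the interface (text in, label and probability out) would look much the same.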

Output Generation and Simulation

Output generation and simulation is the process of using the virtual representation of a scene, as well as other inputs and knowledge, to generate and simulate responses. This involves using algorithms and techniques, such as rule-based systems, decision trees, and neural networks, to generate a set of potential responses for a given situation. These responses are then evaluated and ranked based on various criteria, such as relevance, accuracy, and fluency, to select the best response. The selected response is then simulated, to evaluate its effects and consequences in the virtual representation of the scene. Output generation and simulation is a critical component of a proto-AGI, as it enables the AGI to generate appropriate and meaningful responses to the inputs it receives.

To generate potential actions based on given inputs, you can use a variety of tools and techniques, such as:

Rule-based systems: Rule-based systems are a type of algorithm that can be used to generate potential actions based on a set of predefined rules. These rules specify the conditions under which an action should be taken, and the action to be taken in each case. For example, a rule might specify that if the input contains the phrase “turn on the lights,” the action to be taken is to send a signal to a light switch to turn on the lights.
Decision trees: Decision trees are a type of machine learning algorithm that can be used to generate potential actions based on a set of predefined criteria. These algorithms work by constructing a tree-like model that represents the different possibilities and outcomes for a given situation. The algorithm then uses the model to make predictions about the best action to take, based on the input it receives.
Neural networks: Neural networks are a type of machine learning algorithm that can be used to generate potential actions based on a set of examples. These algorithms work by learning from training data to identify patterns and make predictions about the best action to take. Neural networks can be trained to generate potential actions for a wide variety of tasks, such as language translation and image classification.
Simulated environments: Simulated environments, such as video games and simulation platforms, can be used to generate potential actions based on the current state of the environment. These environments typically contain rules and algorithms that govern the behavior of the entities in the environment, and can be used to generate actions that are appropriate and realistic for the given situation.
Decision diffusion: Decision diffusion, as the term is used in this project, is a proposed technique for generating potential actions by combining the input from multiple sources, such as sensors and other agents, using a diffusion process. The aim is to generate more diverse and creative actions, and to improve the reliability and robustness of the action generation process.
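The rule-based approach, including the "turn on the lights" example above, can be sketched as follows. The `Rule` class and the action strings such as `lights:on` are hypothetical; a real system would likely send signals to devices rather than return labels.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A condition on the input text and the action to propose when it holds."""
    condition: Callable[[str], bool]
    action: str


# Illustrative rules only; the action names are made up for this sketch.
RULES = [
    Rule(lambda text: "turn on the lights" in text, "lights:on"),
    Rule(lambda text: "turn off the lights" in text, "lights:off"),
    Rule(lambda text: "lights" in text, "lights:report_status"),
]


def generate_actions(text: str) -> List[str]:
    # Return every action whose rule matches, as candidates for the judgment system.
    normalized = text.lower()
    return [rule.action for rule in RULES if rule.condition(normalized)]


candidate_actions = generate_actions("Please turn on the lights")
```

Note that the function returns all matching actions rather than the first one. That keeps the choice of a single action with the judgment system, as the architecture describes.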

Short Term Memory

Short-term memory, also known as working memory, is a cognitive system that allows the brain to temporarily store and manipulate information. This system plays a crucial role in output generation and simulation, as it enables the proto-AGI to access and use the information it has collected from the input bus to generate appropriate and relevant responses.

Short-term memory is a limited-capacity system, which means that it can only hold a small amount of information at any given time. This information is encoded in a variety of formats, such as verbal, visual, and spatial, and can be accessed and manipulated through a variety of processes, such as rehearsal, chunking, and manipulation.

Short-term memory is important for output generation and simulation because it enables the proto-AGI to access and use the information it has collected from the input bus to generate responses. For example, if the input bus receives a spoken question, such as “What is the capital of France?”, the short-term memory system can store and manipulate this information to generate a response, such as “The capital of France is Paris.”

Short-term memory is also important for output generation and simulation because it enables the proto-AGI to maintain context and coherence in its responses. For example, if the input bus receives a series of related questions, such as “What is the capital of France?”, “What is the currency of France?”, and “What is the population of France?”, the short-term memory system can store and manipulate this information to generate coherent and consistent responses, such as “The capital of France is Paris, the currency is the euro, and the population is approximately 66 million.”

Short-term memory is a crucial component of output generation and simulation, as it enables the proto-AGI to access, manipulate, and use the information it has collected from the input bus to generate appropriate and relevant responses. By using short-term memory, the proto-AGI can maintain context and coherence in its responses, and generate responses that are more accurate and fluent.
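A limited-capacity memory of this kind maps naturally onto a bounded queue. The sketch below assumes that design; the `ShortTermMemory` class is hypothetical, and the default capacity of seven is only a nod to the often-quoted estimate of human working-memory span.

```python
from collections import deque
from typing import Deque, List, Tuple


class ShortTermMemory:
    """Keeps only the most recent question/answer pairs."""

    def __init__(self, capacity: int = 7) -> None:
        # A deque with maxlen silently drops the oldest item when full.
        self._items: Deque[Tuple[str, str]] = deque(maxlen=capacity)

    def remember(self, question: str, answer: str) -> None:
        self._items.append((question, answer))

    def recall(self) -> List[Tuple[str, str]]:
        return list(self._items)

    def context(self) -> str:
        # Join recent answers so a later response can stay coherent with them.
        return " ".join(answer for _, answer in self._items)


memory = ShortTermMemory(capacity=2)
memory.remember("What is the capital of France?", "The capital of France is Paris.")
memory.remember("What is the currency of France?", "The currency is the euro.")
memory.remember("What is the population of France?", "The population is approximately 66 million.")
```

With a capacity of two, the first question has already been forgotten by the time the third is stored, which is the limited-capacity behavior described above.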

Regular Retraining and Fine-Tuning

Regular retraining and fine-tuning is important to output generation and simulation because it enables the proto-AGI to continuously improve and adapt its performance. This is especially important in dynamic and changing environments, where the information and inputs collected by the input bus are constantly evolving and changing.

Regular retraining and fine-tuning involves using new and updated information and inputs from the input bus to update and improve the algorithms and models used by the output generation and simulation system. This can involve using techniques such as supervised and unsupervised learning, reinforcement learning, and transfer learning to train and fine-tune the algorithms and models.

Regular retraining and fine-tuning is important because it enables the proto-AGI to maintain and improve its performance over time. As the information and inputs collected by the input bus change, the algorithms and models used by the output generation and simulation system can become outdated or incorrect. By regularly retraining and fine-tuning these algorithms and models, the proto-AGI can ensure that they remain accurate and relevant, and can adapt to changes in the environment.

Regular retraining and fine-tuning is also important because it enables the proto-AGI to learn from its mistakes and improve its performance. As the proto-AGI generates responses and interacts with its environment, it can collect feedback and metrics about its performance. By using this feedback to retrain and fine-tune its algorithms and models, the proto-AGI can learn from its mistakes and improve its performance over time.

In summary, regular retraining and fine-tuning is essential to output generation and simulation in a proto-AGI. It keeps the system’s algorithms and models accurate as the environment changes, and it lets the proto-AGI learn from feedback on its own mistakes, so that it generates more accurate and relevant responses over time.
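Learning from feedback can be shown with a very small online update. The sketch below uses a least-mean-squares style step on a linear scorer, which is a deliberately simple stand-in for the supervised, reinforcement, and transfer learning methods mentioned above. The feature names, the reward signal, and the learning rate are all assumptions for illustration.

```python
from typing import Dict, List


def predict(weights: Dict[str, float], features: List[str]) -> float:
    # Score a response as the sum of the weights of its (distinct) features.
    return sum(weights.get(name, 0.0) for name in features)


def update(weights: Dict[str, float], features: List[str],
           reward: float, learning_rate: float = 0.1) -> None:
    # Nudge each active weight so the prediction moves toward the observed reward.
    error = reward - predict(weights, features)
    for name in features:
        weights[name] = weights.get(name, 0.0) + learning_rate * error


# Hypothetical feedback loop: responses that are "polite" and "short" earn reward 1.0.
weights: Dict[str, float] = {}
for _ in range(50):
    update(weights, ["polite", "short"], reward=1.0)

learned_score = predict(weights, ["polite", "short"])
```

Each pass shrinks the remaining error by a constant factor, so the learned score approaches the reward after repeated feedback. Retraining a neural network follows the same idea with many more parameters.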

Judgment

Judgment is the process of evaluating and selecting the best response from a set of potential responses. This involves using various criteria and techniques, such as heuristics, metrics and scoring systems, and human evaluation, to compare and rank the potential responses. The judgment system then selects the response that is most likely to achieve the desired goals and objectives, based on the information and knowledge available to the proto-AGI. The judgment system is a critical component of a proto-AGI, as it enables the AGI to make informed and intelligent decisions about how to respond to the inputs it receives.

On a technical level, a judgment system would need to perform the following tasks:

Receive input: The judgment system would need to receive input from the other systems in the proto-AGI, such as the scene construction and simulation system, and the output generation and simulation system. This input would include information about the current situation, the potential responses, and the criteria for evaluating the responses.
Evaluate responses: The judgment system would need to evaluate the potential responses based on the input it receives. This could involve applying various algorithms and heuristics to determine the relevance, likelihood of success, and potential consequences of each response.
Select the best response: The judgment system would need to select the best response from the pool of potential responses. This could involve using a combination of factors, such as the relevance of the response, the likelihood of success, and the potential consequences, to make a decision.
Communicate the selected response: The judgment system would need to communicate the selected response to the other systems in the proto-AGI, such as the output generation and simulation system, and the output bus. This would involve formatting the response in a way that the other systems can understand and use.

To implement these tasks, the judgment system would need to use a combination of machine learning algorithms, heuristics, and data structures. It would also need to have access to large amounts of training data and resources, such as computing power and memory, to support its operations.
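The evaluate and select tasks can be sketched as a weighted score over the three criteria named above. The weights, the 0 to 1 scales, and the `ScoredResponse` class are assumptions made for this example; a trained model could supply these numbers in a fuller system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredResponse:
    text: str
    relevance: float           # 0..1, higher is better
    success_likelihood: float  # 0..1, higher is better
    risk: float                # 0..1, higher means worse potential consequences


# Hypothetical weights expressing how much each criterion matters.
WEIGHTS = {"relevance": 0.5, "success_likelihood": 0.3, "risk": 0.2}


def score(response: ScoredResponse) -> float:
    # Reward relevance and likelihood of success; penalize risk.
    return (WEIGHTS["relevance"] * response.relevance
            + WEIGHTS["success_likelihood"] * response.success_likelihood
            - WEIGHTS["risk"] * response.risk)


def select_best(responses: List[ScoredResponse]) -> ScoredResponse:
    if not responses:
        raise ValueError("the judgment system needs at least one candidate")
    return max(responses, key=score)


safe = ScoredResponse("Dim the lights gradually.", 0.9, 0.8, 0.1)
risky = ScoredResponse("Cut power to the building.", 0.9, 0.9, 0.9)
chosen = select_best([safe, risky])
```

Here the riskier response is slightly more likely to succeed, yet the penalty on potential consequences leads the judge to prefer the safer one.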

There are a variety of tools and techniques that can be used to evaluate responses. Some examples include:

Machine learning algorithms: Machine learning algorithms, such as decision trees and support vector machines, can be used to evaluate responses based on a set of predefined criteria. These algorithms can learn from training data to identify patterns and make predictions about the best response.
Heuristics: Heuristics are rules of thumb that can be used to evaluate responses based on common sense and experience. For example, a heuristic might suggest that a response is more likely to be correct if it contains relevant information and is expressed in a clear and concise manner.
Metrics and scoring systems: Metrics and scoring systems can be used to evaluate responses based on objective criteria, such as accuracy, relevance, and completeness. These systems can provide a quantitative measure of the quality of a response, which can be used to compare different responses and select the best one.
Decision trees: Decision trees are a type of machine learning algorithm that can be used to evaluate responses based on a set of predefined criteria. These algorithms work by constructing a tree-like model that represents the different possibilities and outcomes for a given situation. The algorithm then uses the model to make predictions about the best response, based on the input it receives.
Support vector machines: Support vector machines (SVMs) are another type of machine learning algorithm that can be used to evaluate responses. These algorithms work by finding a boundary or “hyperplane” that best separates different classes of responses. The algorithm then uses the boundary to classify new responses as belonging to one of the classes, based on their characteristics.
F1 score: The F1 score is a metric that can be used to evaluate the quality of a set of responses. This metric is calculated by taking the harmonic mean of precision and recall, where precision is the proportion of the responses identified as correct that really are correct, and recall is the proportion of all truly correct responses that were identified.
BLEU score: The BLEU score is a metric that is commonly used to evaluate generated text in the field of natural language processing, particularly in machine translation. This metric measures the similarity between a generated response and one or more reference responses, using modified n-gram precision combined with a brevity penalty that discourages overly short outputs. The higher the BLEU score, the more n-grams the generated response shares with the reference response.
Human evaluation: In some cases, responses may need to be evaluated by human experts, who can provide a subjective assessment of the quality of the responses. This can be useful for tasks that require a high level of precision and accuracy, or for situations where the other tools are not able to provide reliable results.
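As a concrete instance of the metrics above, the F1 score can be computed from two sets: the responses the system marked as correct and the responses that truly are correct. The function name and the example sets below are illustrative only.

```python
from typing import Set, Tuple


def precision_recall_f1(predicted: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    # True positives are the items that were both predicted and actually relevant.
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# The system marks four responses as correct; three responses truly are correct.
precision, recall, f1 = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
```

In this example two of the four marked responses are truly correct (precision 1/2) and two of the three truly correct responses were found (recall 2/3), giving an F1 of 4/7.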

Training the Judge

Training the judge subsystem of a proto-AGI on philosophy, morals, and ethics using publicly available text sources such as Project Gutenberg can be a valuable way to improve the accuracy and effectiveness of the system. Project Gutenberg is a digital library that offers over 60,000 free ebooks, many of which are classic works of philosophy, ethics, and morality. By using these texts as part of the training corpus for the judge subsystem, you can give the system access to a wide range of diverse and high-quality sources of information on these topics.

To train the judge subsystem using Project Gutenberg and other publicly available text sources, you could use natural language processing techniques to extract relevant information from the texts and build a corpus of philosophical, moral, and ethical concepts and ideas. This corpus could then be used to train a machine learning model that can identify and classify these concepts and ideas, which could then be used by the judge subsystem to evaluate potential responses to input.

Additionally, you could also use other techniques, such as crowdsourcing, to gather additional information and feedback on the accuracy and effectiveness of the judge subsystem. For example, you could solicit input from a diverse group of volunteers who have expertise in philosophy, morals, and ethics, and use this feedback to refine the training of the judge subsystem and improve its performance.

Using publicly available text sources such as Project Gutenberg, together with techniques such as crowdsourced feedback, can be a useful way to train the judge subsystem of a proto-AGI on philosophy, morals, and ethics. By gathering a large and diverse corpus of information and using machine learning algorithms to process this information, you can improve the accuracy and effectiveness of the judge subsystem and enable it to make more informed and appropriate decisions.

Output Bus

The output bus is a component of a proto-AGI that is responsible for delivering the responses generated by the output generation and simulation system, and selected by the judgment system, to their intended recipients, as well as for monitoring and controlling the effects of the responses on the environment. The output bus performs a variety of tasks, such as:

Converting the responses to the appropriate format: The output bus can convert the responses from their internal representation to the appropriate format for delivery, such as text, speech, or data. This can involve using text-to-speech algorithms to convert text responses into speech, or using image processing algorithms to convert visual responses into images or videos.
Routing the responses to the appropriate destination: The output bus can route the responses to the appropriate destination, based on the intended recipient and the delivery method. This can involve sending the responses to a user or device, such as a speaker or display, or storing the responses in a database or other storage system.
Monitoring the delivery of the responses: The output bus can monitor the delivery of the responses to ensure that they are delivered successfully and without errors. This can involve checking for errors or failures in the delivery process, and retrying the delivery if necessary.
Controlling the effects of the responses: The output bus can control the effects of the responses on the environment, by monitoring the response and the environment, and taking corrective action if necessary. This can involve using sensors and feedback mechanisms to detect any unintended or undesirable effects of the response, and adjusting the response or the environment to mitigate these effects.

The output bus is a critical component of a proto-AGI, as it enables the AGI to deliver its responses to the appropriate recipients, and to ensure that the responses have the desired effects on the environment.

By performing these tasks, the output bus enables the proto-AGI to interact with its environment and achieve its goals and objectives. The output bus is a crucial part of the proto-AGI’s control and feedback loop, as it allows the AGI to monitor and control the effects of its responses on the environment, and to adjust its responses accordingly. By using the output bus, the proto-AGI can ensure that its responses are delivered correctly, and that they have the intended effects on the environment. This enables the proto-AGI to operate more effectively and efficiently, and to achieve its goals and objectives more successfully.
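The routing and delivery-monitoring tasks can be sketched as a small dispatcher that retries failed deliveries. The `OutputBus` class, the convention that a sender returns `True` on success, and the deliberately flaky display are all assumptions for illustration.

```python
from typing import Callable, Dict, List


class OutputBus:
    """Routes responses to registered destinations and retries on failure."""

    def __init__(self, max_attempts: int = 3) -> None:
        self._routes: Dict[str, Callable[[str], bool]] = {}
        self._max_attempts = max_attempts

    def register(self, destination: str, sender: Callable[[str], bool]) -> None:
        self._routes[destination] = sender

    def deliver(self, destination: str, response: str) -> int:
        # Return the number of attempts used; raise if every attempt fails.
        if destination not in self._routes:
            raise KeyError(f"unknown destination {destination!r}")
        sender = self._routes[destination]
        for attempt in range(1, self._max_attempts + 1):
            if sender(response):
                return attempt
        raise RuntimeError(
            f"delivery to {destination!r} failed after {self._max_attempts} attempts"
        )


delivered: List[str] = []
call_count = {"n": 0}


def flaky_display(response: str) -> bool:
    # Simulated device that fails on its first call and succeeds afterwards.
    call_count["n"] += 1
    if call_count["n"] == 1:
        return False
    delivered.append(response)
    return True


out_bus = OutputBus()
out_bus.register("display", flaky_display)
attempts = out_bus.deliver("display", "The capital of France is Paris.")
```

Controlling the effects of a response would need a further feedback path from sensors back into the input bus, which this sketch leaves out.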

The repo in its embryonic state is available at: https://github.com/C0deMunk33/TAPAGI