August 5, 2024

LLM Benchmarking on the Nosana grid

LLM Benchmarking on the Nosana grid

Intro

The Nosana grid contains about two thousand nodes with various hardware configurations, which are actively running AI models. At the start of the Nosana test phase, these nodes have mostly been running image generation or transcription jobs through Stable Diffusion and Whisper. Although these jobs are suitable to make sure nodes are functioning properly, they do not provide any additional benefit from an AI use case perspective.

So to make the best use of the nodes until the launch of the mainnet, at the beginning of 2024 we started looking for opportunities to run jobs that would be useful to the Nosana community. As Large Language Models (LLMS) are projected to be the center pillar of Nosana’s AI demand in the foreseeable future, we decided to hire a dedicated AI specialist team to start working on a large-scale LLM Benchmarking project on the Nosana grid. This project aims to provide information that will help clients make better informed decisions, help the Nosana team implement a fairer market system, and contribute valuable information to the LLM research community. In this blogpost, we will go over the required fundamentals to understand how benchmarking works, and then show how we can use the results of the benchmarks to create fair markets.

LLM Fundamentals

Let’s start with the fundamentals of LLMs. What is an LLM? How does it work? And what do we need to run one?

Architecture

For anyone reading up on LLMs they might seem like complex neural networks used in artificial intelligence. While this is true to some extent, in practice, LLMs essentially consist of two easy to understand files. The model weights file, and the model code file. The model weights file contains the parameters of the model and determines the model size which is measured in bytes. The model code file contains the instructions on how to load and run the model.

LLM

When looking at llama3.1–70B, the identifier 70B means the model contains 70 billion parameters. The parameters of the model are stored with 16-bit floating point precision, which equals to 2 bytes, making the model weights file size 140 gigabytes.

Each parameter in the model weights file corresponds to a neuron in the architecture described in the model code file. For most modern day LLMs this architecture is called a transformer. The image below shows a generalized transformer architecture used for producing text.

LLM

A detailed explanation of transformers is beyond the scope of this article, so we will focus only on the part that is the most important to understand for this benchmarking research. For a LLM to output text it needs to perform computations at every layer, and to do this, it needs to have its parameters and specific cached computations loaded into memory at the respective layers in the model.

Inference

Now that we know what an LLM is, let’s see how we actually produce language with them. LLMs are trained on the task of next token prediction. Tokens are units of text that correspond to words or parts of words, and they are the vocabulary that is understood by LLMs. So as far as LLMs are concerned, producing language is nothing more than correctly predicting the next word or subword given the preceding ones. This process of producing tokens with LLMs is called inference. The speed of inference is an important factor in the usability of LLMs, and it is influenced by the model size, architecture, and the hardware & software configuration on which it is run.

So how does inference work? To facilitate inference, LLMs go through 2 main stages. The prefill phase and the decoding phase.

In the prefill phase, the model processes all input tokens simultaneously to compute all the necessary information for generating subsequent tokens. In practice, the duration of this phase corresponds to the time you wait until the LLM starts generating its response. The prefill phase is highly parallelized and makes efficient use of the computing capacity, passing through the model only once.

At the start of the decoding phase, the model uses the cached information computed during the prefill phase to generate a token. From this point on, for every newly generated token, the previous token needs to pass through the network together with the cached computations. This process of repeatedly going through the network is not computationally intensive as computations only have to be performed for a single token. Instead, the decoding phase is memory intensive, because the cached information has to be moved around to perform the necessary computations.

Phases

The prefill phase is compute bound, while the decoding phase is memory bound. A process is considered compute-bound when it requires significant computation and its speed or performance is limited primarily by the amount of processing power of the hardware. A process is memory bound when its performance is limited by the rate at which data moves to and from memory. This rate is called the memory bandwidth.

Hardware Requirements

Alright, so we need compute and memory capacity to run LLMs. This is where our GPUs come in. Let’s figure out what we would be able to run with the Nosana grid’s most popular GPU, the RTX 4090. When considering GPUs, the available memory is expressed in VRAM, the memory bandwidth is expressed in bytes per second, and the computational capacity is expressed in FLOPS, floating point operations per second. The RTX 4090 has 24 GB VRAM, a memory bandwidth of 1,008 GB/s, and can produce 82.58 teraFLOPS.

The hard requirement for running an LLM is having enough VRAM to store its parameters + cached computations. For llama3–8B with 16-bit floating point precision, this would approximately be 18GB of VRAM. Our RTX 4090 will be able to handle that.

Figuring out if a model can be run is fairly straightforward, but figuring out how it will perform is hard to determine as it is dependent on many variables. That being said, given the constraint of a memory bound process we can give a rough theoretical estimation of the amount of tokens per second by dividing the model size by the memory bandwidth.

18GB ÷ 1,008 GB/s = 18ms

This calculation gives us 18ms per token or 56 tokens per second. So this calculation would give a rough estimation for an inference run with a single input sequence, which is a predominantly memory-bound process. However, if we would increase the amount of sequences processed at the same time, it could shift from memory-bound to compute-bound and the amount of FLOPS of the GPU would start to play a more important role.

Optimization

We have covered the fundamentals of LLMs and understand which variables are important for running them. By using optimization techniques we can tweak these variables resulting in beneficial tradeoffs that will allow us to run bigger models, or increase inference speed. There are many knobs to turn when it comes to optimization, but the three most important ones are quantization, computation enhancement and caching strategies.

Quantization reduces the precision of the model’s weights to lower bit representations. For example, if we would use quantization to transform the 16-bit floating point precision of our llama3.1–8B model to 8-bit integer, then we would reduce the model size from 16GB to 8GB. While this can have an impact on the accuracy of the model, the significant decrease in memory usage makes it a viable optimization technique.

Llama 3.1

Computation enhancement focuses on optimizing the operations performed within the model such as the attention mechanism. This mechanism is an important part of the transformer’s success, but also computationally expensive. By modifying the order of computations or by fusing together certain model layers we can reduce the data that needs to be written from and to memory.

Caching strategies involve the reduction of the cached computations that are kept in memory. By simplifying the structure of the cached computations it is possible to significantly reduce the memory footprint in exchange for a slight decrease in model accuracy.

Inference Frameworks

We have briefly gone over the main optimization techniques and it is already becoming apparent that implementing any potentially desired change might become complicated. Luckily, there are various LLM inference frameworks that provide an interface to a wide range of models with built-in options for optimization. As the field of LLM inference is rapidly evolving, these frameworks roll out updates frequently and there is no clear-cut best optimization framework.

One such framework is Ollama. We will give it a special mention here because it is the framework that was used to gather the initial benchmarking results. Ollama originated as a user-friendly framework with the goal of democratizing the use of LLMs. The Ollama team has impressively succeeded in achieving this goal, as it is undoubtedly the easiest framework for anyone with minimal hardware requirements to spin up an LLM. It is especially fitting for running and testing consumer grade hardware as its optimization techniques seamlessly allow models of any size to be run on GPUs with any amount of VRAM.


Nosana Benchmarking

Enough preliminaries. It is time to get into the actual benchmarking. Lets start off by going over the data we collected, and how we collected it.

Benchmarking Setup

As mentioned, Ollama was picked as the initial framework for benchmarking due to its compatibility with consumer grade hardware. With this framework in place, we implemented a custom made benchmarking script to gather data in two distinct categories, model performance and system specifications.

The performance data contains variables on the total amount of produced tokens and how long it took to produce them, but also the clockspeed and the wattage of the GPU. The system specifications data contains variables on an extensive set of system configurations that can have either large or small effects on model performance. The tables below illustrate the kinds of variables and their potential values.

Specs
Performance

With the to be collected variables defined, we now had to pick a model for benchmarking. Given the notable performance of the newly launched llama models and the Nosana nodes with a wide variety of VRAM capacities, we decided to pick llama3.1–8B that can fit on all GPUs.

Now for the actual procedure of benchmarking we had to create a method compatible with the current job posting structure of Nosana. A job has a maximum length of X hours, giving us plenty of time to load in one or more models, prompt them, and measure their performance. During a job every model got prompted with randomly sampled sequences such as “Write a step-by-step guide on how to bake a chocolate cake from scratch”. The content of a prompt does not have an influence on the performance of the model, but it does determine the length of the response, so to make sure the LLMs spent most of their time on actual inference the pool of prompts we made encouraged longer answers. At the end of each job the output contains the model performance and system specs variables which we extracted and added to a large dataset.

Evaluation

Before we get into the results, let’s quickly look at the key evaluation metrics for LLM inference, inference speed and time to first token (TTFT). The inference speed is measured in tokens produced per second during the decoding phase, and largely determines how long a user has to wait for a full response. The TTFT is a measurement of the time a user has to wait for the first response token, which is a crucial component in the usability and desirability of many LLM applications.

In this first article, where we test consecutive single user queries, we will mainly focus on inference speed as a measure of performance, not the TFTT. The TTFT measurement is dependent on the pre-fill phase, which is a compute bound process as we mentioned. In our setting where we process single queries at a time, the amount of computations needed during the prefill phase is low, resulting in uniformly low TTFTs for all GPUs. Inference speed on the other hand, which is a measurement of how fast the decoding process is executed, is heavily dependent on memory bandwidth. There is a large variety of memory bandwidth capacity between the GPUs on the Nosana grid, so focusing on the inference speed metric will provide us with the most insightful observations.

As a final note on our evaluation procedure, it is worth mentioning that most LLM inference benchmarking research focuses on the hardware’s capacity to handle concurrent users. When dealing with concurrent users, the model has to handle multiple queries at the same time. This setting makes it possible to maximally utilize the GPUs memory and computational capacities, especially for enterprise hardware with large amounts of VRAM. We have deliberately chosen to perform this initial benchmark with consecutive single queries, or 1 concurrent user, to define a setting in which we can evaluate all GPUs on the Nosana grid. This makes it so that our results will help us identify well performing and underperforming nodes across all markets, which enables the implementation of a fair market structure. However, the current setup does not benchmark the actual maximum capabilities of the nodes, which will be a topic in one of our upcoming articles.


Results

The dataset we used contains information on 10,596 jobs performed by 550 unique nodes in 15 Nosana Markets. Within these nodes there are 39 unique types of GPUs. The RTX 4090 and 3090 are the most common GPUs by far with 122 and 101 counts respectively.

Market Performance

As an initial goal of our benchmarking research we set out to create fair markets. In the presented results we have aggregated the data on market level, so we can show the performance for each market and highlight opportunities for improvements.

Performance

In the above visual we see the average tokens per second for each market. All the way at the top we have the H100 with 111 tokens per second, and at the bottom we have the RTX 4060 with 42 tokens per second. At first glance, this graph does not indicate anything out of the ordinary. On average the general trend seems to be, the more expensive the GPU the better the performance.

Performance

When we look at the performance variation within markets, indicated by the black bar, we get some more interesting findings. The larger the variation within a market the more varied the performance of different nodes within a market is. Varied performance within markets is undesirable for both clients and node runners. When clients use a Nosana node for inference compute, they want reliable performance suitable for their application. When node runners provide compute, they want to be paid based on the quality of compute they provide. A high variance within markets interferes with both of these objectives.

So having completely fair markets would mean that we would have a variance of 0 within all of them. However, getting to 0 variance within each market would require drastic solutions that impede the functionality of the Nosana grid. Fortunately though, designing the markets to minimize performance variance is something we can do. For example, we can implement a minimum performance threshold based on the average of each market. Every node performing worse than the average minus the performance threshold would be removed from the market. This would not only reduce the variance of the market, which is caused predominantly by underperforming nodes, but also increase the average tokens per second within the market.

The visual below illustrates what would happen to the market if a 20% or a stricter 10% threshold were to be implemented.

Performance

As we can see, for the markets with high variation, the threshold causes a significant increase in performance. This is because the threshold removes a larger amount of underperforming nodes within these markets. As a result, these markets become fairer because they provide more reliable compute to clients, and payout similar amounts for similar quality compute.

Performance Monitoring

After we observed the variance between nodes, we started analyzing variables that cause performance fluctuations. Even though we used a large set of hardware specs for our analysis, the results were unambiguous and pointed at two main factors responsible for performance, GPU type & Wattage. GPU type is the foundation that determines the performance range of nodes, but the wattage plays an arguably more crucial role by determining the location within this range. For example, a 3070 GPU running at full power can outperform a 4090 GPU that’s not getting enough, showing how proper wattage allocation can be just as important as the GPU model itself.

Now with this knowledge we are able to categorize 3 types of node runners that deviate from the expected performance. Spoofers, malignant node runners that fake hardware configurations & performance. Underclockers, economically greedy node runners that do not provide enough power to their hardware setup. And a third category of node runners with unforeseen technical issues. As the Nosana team we want to remove spoofers, keep underclockers in check, and help any node runner facing technical difficulties. Due to the unfakeable nature of model inference performance, monitoring this metric helps us identify which category of underperforming node runner we are dealing with, and take appropriate measures to balance the markets.

Node Leaderboard

As a first practical step towards fair markets we introduce the Nosana Node Leaderboard. Here we track the performance of each node within the market and display relevant hardware configurations. This allows us, together with the community, to monitor the performance of Nosana nodes in a transparent way. Go check it out!


Next Steps

By doing research using the Nosana Grid we aim to accomplish two main goals. First, to create the optimal Nosana experience by incorporating data-driven insights into our decision making. Second, to contribute valuable research findings to the broader large language model community. In this article we mainly focused on the first goal, as the current benchmark results are practically useful for the Nosana Grid, but do not provide information about the maximum performance capacity of specific model-hardware combinations in realistic settings. In our next article, we’ll explore maximum performance in real-world conditions.

Stay Updated with Nosana

Get the latest insights on AI infrastructure, GPU launches, and network innovations — all in one place

Catch Up on Nosana's Recent Blogs

Run your AI jobs across a decentralized GPU grid. No lock-ins, no downtime, no inflated cloud bills just pure compute power, when you need it.

The New Nosana Experience Is Live
March 13, 2026 |

The New Nosana Experience Is Live

Today marks a major step forward for Nosana.

Empowering African Languages with AI: How Christex and Geneline-X Use Nosana to Build Inclusive Voice Models
March 5, 2026 |

Empowering African Languages with AI: How Christex and Geneline-X Use Nosana to Build Inclusive Voice Models

Artificial intelligence is reshaping education, communication, and economic opportunity, but only for the languages and communities it supports.

Nosana Grants Program Welcomes AiMo Network
March 3, 2026 |

Nosana Grants Program Welcomes AiMo Network

Nosana is pleased to welcome AiMo Network as an official Nosana Grantee through the Nosana Grants Program.

Nosana Monthly - February Edition
March 2, 2026 |

Nosana Monthly - February Edition

From launching the Nosana Learning Hub, to expanding real GPU supply through OpenGPU, rolling out infinite restart strategies by default, and partnering with Sallar and Alio, the Nosana GPU Marketplace is scaling across infrastructure, tooling, and ecosystem integrations.

Nosana 🤝 OpenGPU: Expanding Access to AI Compute
February 5, 2026 |

Nosana 🤝 OpenGPU: Expanding Access to AI Compute

The infrastructure behind artificial intelligence is changing rapidly. As demand for GPU power continues to rise, so does the need for more open, efficient, and accessible computing solutions.

🚀 January on Nosana: Milestones, Momentum & What’s Next
January 30, 2026 |

🚀 January on Nosana: Milestones, Momentum & What’s Next

January was one of those months where you pause for a second, look at the numbers, the people, the product and realize just how much ground has been covered.

December Recap: Closing the Year in Motion
December 30, 2025 |

December Recap: Closing the Year in Motion

December didn’t just close the year, it validated the network! Real GPU workloads, builders shipping in production, and milestones that matter!

Introducing @nosana/kit, the comprehensive 2.0 toolchain for Nosana
December 23, 2025 |

Introducing @nosana/kit, the comprehensive 2.0 toolchain for Nosana

Comprehensive toolchain for managing jobs, markets, runs, and protocol operations on the Nosana compute network.

Nosana 2025: From Testnets to Real-World Compute
December 23, 2025 |

Nosana 2025: From Testnets to Real-World Compute

In 2025, Nosana reached a point of maturity where experimentation gave way to production and decentralized compute shifted from an emerging idea into dependable infrastructure.

The Heart of Nosana: Nosvember 2025 Recap
December 18, 2025 |

The Heart of Nosana: Nosvember 2025 Recap

As the dust settles on another unforgettable Nosvember, it’s clear once again: the Nosana community is the heart of everything we do.

The Nosana Grants Program: Fueling the Next Wave of AI Builders, Vibers, and Dreamers
December 10, 2025 |

The Nosana Grants Program: Fueling the Next Wave of AI Builders, Vibers, and Dreamers

Access $5K-$50K in funding, compute credits, and decentralized GPU infrastructure to build the next generation of AI products.

Agent 102 Recap: MCP, Mastra, and the Next Wave of AI Builders
December 4, 2025 |

Agent 102 Recap: MCP, Mastra, and the Next Wave of AI Builders

Agent 102 our third Builders’ Challenge, pushed the bar higher and our builders cleared it with style.

Nosana Monthly - November Edition
December 1, 2025 |

Nosana Monthly - November Edition

A month of community, builders, and next-gen AI.

Visual Command Center: Managing Deployments with Nosana's Dashboard
November 20, 2025 |

Visual Command Center: Managing Deployments with Nosana's Dashboard

Part 2 of our deployment series: Discover how our new dashboard makes managing distributed deployments as intuitive as clicking a button.

Nosana’s Spare GPU Capacity Is Now Powering Scientific Research
November 12, 2025 |

Nosana’s Spare GPU Capacity Is Now Powering Scientific Research

Nosana’s spare GPU power now fuels Folding@Home, advancing global biomedical research and showcasing the real-world impact of decentralized compute.

Nosana Monthly - October Edition
November 10, 2025 |

Nosana Monthly - October Edition

This month has marked a major step in Nosana’s journey. We’ve expanded into new regions, launched new tooling, partnered with leading ecosystems, and brought hundreds of builders into the decentralized AI future.

From Proposal to Vote: How NNP-0001 Will Be Decided
November 5, 2025 |

From Proposal to Vote: How NNP-0001 Will Be Decided

This post explains timeline, eligibility, and the voting procedure so every holder knows how to participate.

Nosvember Games: A month of celebration for the Nosana Community!
November 3, 2025 |

Nosvember Games: A month of celebration for the Nosana Community!

With November ahead, we’re bringing back Nosvember — a full month dedicated to the Nosana community.

From Yield to Growth: Aligning NOS Rewards with Real Usage!
October 22, 2025 |

From Yield to Growth: Aligning NOS Rewards with Real Usage!

The first Nosana Network Proposal NNP-001 Tokenomics is live. The proposal has a simple goal to make NOS rewards work harder by funding what grows the network.

Elevating the Deployment Experience: Introducing Nosana's New Deployment Manager
October 16, 2025 |

Elevating the Deployment Experience: Introducing Nosana's New Deployment Manager

This is the first article in our technical series exploring how we're revolutionizing deployments on the Nosana network.

Builders Challenge - Agents 102
October 10, 2025 |

Builders Challenge - Agents 102

Build intelligent AI agents with Mastra and deploy them on Nosana's decentralized network. Compete for $3,000 USDC in prizes!

Nosana Expands Across Asia: Powering the Future of AI Infrastructure
October 1, 2025 |

Nosana Expands Across Asia: Powering the Future of AI Infrastructure

Asia: the fastest-growing hub for AI and Web3

How We're Helping AI Startups Cut Costs by 67% With Open-Source Models
August 7, 2025 |

How We're Helping AI Startups Cut Costs by 67% With Open-Source Models

Nosana helps AI startups dramatically reduce operational costs by replacing expensive proprietary AI models with optimized open-source alternatives.

Agent 101 Recap: How Builders Took on the Nosana Challenge
July 18, 2025 |

Agent 101 Recap: How Builders Took on the Nosana Challenge

Agent 101 was our second Builders’ Challenge, a call to action for devs to build smart, scalable AI agents that run on Nosana’s decentralized GPU network. And the community more than delivered.

Builders Challenge - Agents 101
June 25, 2025 |

Builders Challenge - Agents 101

Second edition of the Nosana Builders's Challenge, build and deploy Agents — and compete for over 3,000 USDC in prizes

Builders Challenge - Create a Nosana Template
March 31, 2025 |

Builders Challenge - Create a Nosana Template

This is your chance to showcase your skills, gain visibility, learn new tools — and compete for over 3,000 USDC in prizes**

Introducing Swapping and Priority Fees
February 11, 2025 |

Introducing Swapping and Priority Fees

Introducing Nosana's newest features, in-Dashboard token swapping and dynamic priority fees.

Nosana's GPU Marketplace is Open to the Public
January 14, 2025 |

Nosana's GPU Marketplace is Open to the Public

Today marks a major milestone for Nosana as we officially open our GPU Marketplace to the public.

2024 at Nosana: A Year In Review
December 27, 2024 |

2024 at Nosana: A Year In Review

With the Mainnet launch just weeks away, it feels like the right time to reflect on the milestones that have defined 2024.

Road to Mainnet: Nosana's Next Chapter
December 23, 2024 |

Road to Mainnet: Nosana's Next Chapter

The Nosana Test Grid is now production-ready, paving the way for the upcoming launch of the Nosana Mainnet.

Test Grid Phase 3: final steps to mainnet
September 30, 2024 |

Test Grid Phase 3: final steps to mainnet

Today Nosana’s Test Grid has successfully transitioned to its third and final phase. This is an exciting time, as the final core components for Nosana’s Main Grid will be rolled out and tested.

LLM Benchmarking: Cost Efficient Performance
September 13, 2024 |

LLM Benchmarking: Cost Efficient Performance

Explore Nosana's latest benchmarking insights, revealing a compelling comparison between consumer-grade and enterprise GPUs in cost-efficient LLM inference performance.

Nosana Team is Heading to Singapore for Solana Breakpoint and Token2049
September 11, 2024 |

Nosana Team is Heading to Singapore for Solana Breakpoint and Token2049

The Nosana team is heading to Singapore for Solana Breakpoint and Token2049 to connect with builders and innovators in the DePIN and AI sectors.

Nosana Staking Program Update
May 21, 2024 |

Nosana Staking Program Update

To ensure the network's continued success and long-term potential, we're implementing a key update to our staking program.

Nosana at Solana Hacker House Dubai 2024
April 9, 2024 |

Nosana at Solana Hacker House Dubai 2024

Our core team is heading to Solana Hacker House Dubai edition to connect with builders and innovators in the DePIN and AI sector.

Test Grid Phase 2 Update
April 3, 2024 |

Test Grid Phase 2 Update

An update on our plans for Test Grid Phase 2

How AI Inference Drives Business Applications in 2024
March 8, 2024 |

How AI Inference Drives Business Applications in 2024

AI inference bridges the gap between complex AI models and their practical use cases.

Testing the First GPU Grid for AI Inference
February 5, 2024 |

Testing the First GPU Grid for AI Inference

Nosana has successfully tested the first decentralized GPU grid developed and customized for AI inference workloads.

Exploring the Distinctions Between GPUs and CPUs
January 30, 2024 |

Exploring the Distinctions Between GPUs and CPUs

Initially devised for graphics rendering in gaming and animation, GPUs now find applications well beyond their initial scope.

An In-depth Exploration of AI Inference: From Concept to Real-world Applications
January 24, 2024 |

An In-depth Exploration of AI Inference: From Concept to Real-world Applications

In this third chapter of the Nosana Edu series, we'll break down how AI inference works, explore its fundamental concepts, and discuss how it's impacting businesses and industries.

Nosana's Strategic APY Adjustment for Balanced Growth and Stability
January 12, 2024 |

Nosana's Strategic APY Adjustment for Balanced Growth and Stability

Aligning Long-term Success with Sustainable Rewards

Deep Learning Unveiled: Navigating Training, Inference, and the GPU Shortage Dilemma
January 11, 2024 |

Deep Learning Unveiled: Navigating Training, Inference, and the GPU Shortage Dilemma

Right now this field is facing a big problem: there aren't enough GPUs

Nosana 2023: Pioneering AI and GPU Computing
January 2, 2024 |

Nosana 2023: Pioneering AI and GPU Computing

With the demand for AI inference showing no signs of slowing, our commitment in 2023 centered on scaling up new capacity and expanding our offerings

Deep Learning Demystified
December 28, 2023 |

Deep Learning Demystified

A Comprehensive Guide to GPU-Accelerated Data Science

Navigating a Sustainable Future in Tech: The Nosana Initiative
December 15, 2023 |

Navigating a Sustainable Future in Tech: The Nosana Initiative

Addressing the GPU Shortage with a Sustainable Lens

Test Grid Phase 1: Accelerating the AI and GPU Computing Revolution
December 1, 2023 |

Test Grid Phase 1: Accelerating the AI and GPU Computing Revolution

The launch of our Test Grid represents a significant moment in AI and GPU-compute technology

Unlock the Earning Potential of Your GPU: How to Monetize Your Hardware with Nosana
November 28, 2023 |

Unlock the Earning Potential of Your GPU: How to Monetize Your Hardware with Nosana

If you have an underutilized GPU gathering dust, it's time to turn it into a source of revenue

Nosana Launches Incentivized Public Test Grid with 3 Million $NOS
November 17, 2023 |

Nosana Launches Incentivized Public Test Grid with 3 Million $NOS

A multi-phase program that will further power the AI revolution.

Nosana's $NOS Rewards Farm on Raydium!
November 15, 2023 |

Nosana's $NOS Rewards Farm on Raydium!

Are you ready to expand your $NOS stack? Let's get started!

BreakPoint 2023: Bridging the Global GPU Shortage
November 9, 2023 |

BreakPoint 2023: Bridging the Global GPU Shortage

We're building the world's largest decentralized compute grid by directly connecting GPUs and AI users

Nosana's New Direction: AI Inference
October 13, 2023 |

Nosana's New Direction: AI Inference

GPU-compute grid for AI inference