Introduction¶
BlindLlama provides Confidential & transparent APIs for querying open-source models. It:

- Serves models in a Confidential & transparent hardened environment. Even our admins cannot access or view any user data!
- Provides robust technical proofs that users are communicating with an authentic BlindLlama server deployed within a hardened environment. This means users have technical guarantees that we cannot access or use their data!
We provide a Python package which you can easily download and use to query our models. Currently, we serve the Llama2 70b model, but we will serve more popular open-source models in the near future.
Llama2 is a text-generation LLM (large language model) that can be queried in a similar way to OpenAI's ChatGPT.
The current alpha version does not yet include the full set of security features. Do not test our API with confidential information... yet!
You can follow our progress towards the next beta and audit-ready versions of BlindLlama on our roadmap!
Let's now take a look at how you can query Llama2 with BlindLlama.
You can follow along in your own environment or online using our Google Colab notebook.
Getting your access token¶
Before you can use our API, you will need to get your personal access token from Mithril Cloud.
You can get started with our API for free, but note that the number of free queries per user is capped.
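Once you have your access token, it's best to keep it out of your source code. Here's a minimal sketch of loading it from an environment variable; the variable name `MITHRIL_API_KEY` is just an illustration, not an official convention:

```python
import os

# Hypothetical variable name chosen for this example; export it first in your shell:
#   export MITHRIL_API_KEY="your-token-here"
API_KEY = os.environ["MITHRIL_API_KEY"]
```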
Installation¶
Next, you'll need to install:

- the `blind_llama` Python client
- `tpm2-tools`, a library used by the client to verify the server
```bash
!pip install blind_llama
!apt install tpm2-tools
```
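As a quick sanity check (not an official verification step), you can confirm the client installed correctly by importing it:

```python
# If this import succeeds, the blind_llama package is installed and importable
import blind_llama
```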
Querying the model¶
Now you're ready to start playing with our privacy-friendly Llama 2 model!
First of all, you'll need to import the `blind_llama` package and then copy and paste your API key into the corresponding `api_key` variable in order to use our API.
BlindLlama's client SDK is based on that of `openai` to facilitate uptake for end users who are already familiar with that SDK.

Our querying method `completion.create()` accepts three options, `model`, `prompt`, and `temperature`, of which only `prompt` is compulsory:
- The `prompt` option is a string containing your query input text. Feel free to modify the `prompt` option below to test the API with new prompts!
- The `model` option allows you to select the model you wish to use and is set to Llama-2-70b by default. We currently only support this model, but will add more models in the near future.
- The `temperature` option is the sampling temperature and should be a value between 0 and 1. The higher the value, the more random the output will be; the lower the value, the more deterministic it will be. The `temperature` option is set to 0 by default. This will make the model use log probability to automatically increase the temperature until certain thresholds are hit.
The `completion.create()` method returns the model's response as a string.
```python
import blind_llama as openai

# set your Mithril Cloud API key
openai.api_key = "YOUR_API_KEY_HERE"

# query model and save output in response string
response = openai.completion.create(
    model="meta-llama/Llama-2-70b-chat-hf", # set model with HuggingFace model ID
    prompt="Describe the Python programming language",
    temperature=0.7, # set sampling temperature to generate relatively random output
)

print(f"Result:\n{response}") # print out response
```
```
/home/laura/BlindLlama/env/lib/python3.10/site-packages/urllib3/connectionpool.py:1095: InsecureRequestWarning: Unverified HTTPS request is being made to host 'aicert_worker'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
/home/laura/BlindLlama/client/blind_llama/completion.py:216: UserWarning: The quote from the TPM is not endorsed by the Cloud provider for the alpha version of BlindLlama v0.1. For more information look at https://github.com/mithril-security/blind_llama
  warnings.warn(f"The quote from the TPM is not endorsed by the Cloud provider for the alpha version of BlindLlama v0.1. For more information look at https://github.com/mithril-security/blind_llama")

Result:
in 30 words or less. Python is a versatile, high-level programming language that emphasizes readability and ease of use, featuring a minimalist syntax and flexible built-in data structures.
```
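Since only `prompt` is compulsory, you can also omit the optional parameters and rely on the defaults described above (the Llama-2-70b model and a `temperature` of 0). This is just a sketch of the same call with the defaults assumed to apply when the options are omitted:

```python
import blind_llama as openai

openai.api_key = "YOUR_API_KEY_HERE"

# Only the compulsory prompt option is passed;
# model and temperature are assumed to fall back to their defaults
response = openai.completion.create(
    prompt="Describe the Python programming language",
)
print(response)
```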
Security¶
While on the face of it, the API appears to work just like a regular AI API, BlindLlama is doing a lot under the hood to make sure user data remains confidential!
When you connect to the BlindLlama server, the client will:
- Check that it is talking to an authentic BlindLlama server, through attested TLS
- Check that the server is serving the expected code and is deployed within a hardened Confidential & transparent environment
If either of these checks fails, you will see an error and will be unable to connect to the server!
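If you'd rather handle a failed verification gracefully than let the error propagate, you can wrap the call in a `try`/`except`. This is a minimal sketch: the exact exception class raised by the client on a failed check isn't documented here, so a generic `Exception` is caught:

```python
import blind_llama as openai

openai.api_key = "YOUR_API_KEY_HERE"

try:
    response = openai.completion.create(prompt="Hello!")
    print(response)
except Exception as err:  # assumption: the client's specific attestation error class is unknown
    # Reached when the server fails the authenticity or hardened-environment checks
    print(f"Could not connect to a verified BlindLlama server: {err}")
```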
⚠️ Note that some security features will not be implemented until the beta or 1.0 launch dates. You can check out our progress on our roadmap.
You can check out our overview of how we make our APIs confidential in the next section and learn more about the underlying key concepts in our concept guide.
We also created the BlindLlama whitepaper to cover the security features behind BlindLlama in greater detail. You can read or download the whitepaper here!