{"id":"chatcmpl-BPctVve9zkK7PuUsFvpWbsWzWdR0b","object":"chat.completion","created":1745447409,"model":"gpt-4.1-2025-04-14","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I help you today?","refusal":null,"annotations":[]},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":19,"completion_tokens":10,"total_tokens":29,"prompt_tokens_details":{"cached_tokens":0,"audio_tokens":0},"completion_tokens_details":{"reasoning_tokens":0,"audio_tokens":0,"accepted_prediction_tokens":0,"rejected_prediction_tokens":0}},"service_tier":"default","system_fingerprint":"fp_b38e740b47"}
OpenAI's o1 was the first reasoning model to see truly large-scale commercial use. What sets it apart from other models is that it performs CoT (Chain-of-Thought) reasoning before generating an answer. The gains are especially pronounced in STEM subjects such as mathematics, and on complex tasks in general. OpenAI published an article, Learning to Reason with LLMs, describing these improvements.
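From the API side, a reasoning model is called like any other chat model; the hidden CoT simply shows up as billed reasoning tokens. The sketch below assumes the o-series `reasoning_effort` parameter of the official SDK; the token count appears in the same `usage.completion_tokens_details.reasoning_tokens` field visible in the response JSON above:

```python
# Sketch: calling a reasoning model. Assumes the openai SDK's o-series
# `reasoning_effort` parameter; the hidden CoT is billed as reasoning tokens.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="o1",
    reasoning_effort="medium",  # how much hidden CoT the model may spend
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
)
print(response.choices[0].message.content)
print(response.usage.completion_tokens_details.reasoning_tokens)
```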
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant LLM as Model
    Note over Dev,LLM: 1: Tool Definitions + Messages
    Dev->>LLM: get_weather(location)<br/>What's the weather in Paris?
    Note over Dev,LLM: 2: Tool Calls
    LLM-->>Dev: get_weather("paris")
    Note over Dev: 3: Execute Function Code
    Dev->>Dev: get_weather("paris")<br/>{"temperature": 14}
    Note over Dev,LLM: 4: Results
    Dev->>LLM: All Prior Messages<br/>{"temperature": 14}
    Note over Dev,LLM: 5: Final Response
    LLM-->>Dev: It's currently 14°C in Paris.
```
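In code, one pass through this loop might look like the sketch below, using the official `openai` Python SDK; the `get_weather` implementation is a hypothetical stub:

```python
import json
from openai import OpenAI

client = OpenAI()

# 1: Tool definitions + messages
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

def get_weather(location: str) -> dict:
    # Hypothetical stub; a real version would call a weather service.
    return {"temperature": 14}

# 2: The model replies with a tool call instead of text
response = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
tool_call = response.choices[0].message.tool_calls[0]

# 3: Execute the function code locally
args = json.loads(tool_call.function.arguments)
result = get_weather(**args)

# 4: Send back all prior messages plus the tool result
messages.append(response.choices[0].message)
messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": json.dumps(result)})

# 5: The model produces the final natural-language response
final = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=tools)
print(final.choices[0].message.content)  # e.g. "It's currently 14°C in Paris."
```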
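Structured outputs constrain the model to emit JSON that matches a schema. In the example below, the schema is defined as a Pydantic model and the response is validated against it; `client` is assumed to be an OpenAI-compatible client pointed at a local serving endpoint (such as SGLang or vLLM), and `print_highlight` a notebook display helper.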
```python
from pydantic import BaseModel, Field

# Define the schema using Pydantic
class CapitalInfo(BaseModel):
    name: str = Field(..., pattern=r"^\w+$", description="Name of the capital city")
    population: int = Field(..., description="Population of the capital city")

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "Please generate the information of the capital of France in the JSON format.",
        },
    ],
    temperature=0,
    max_tokens=128,
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "foo",
            # convert the pydantic model to json schema
            "schema": CapitalInfo.model_json_schema(),
        },
    },
)

response_content = response.choices[0].message.content
# validate the JSON response by the pydantic model
capital_info = CapitalInfo.model_validate_json(response_content)
print_highlight(f"Validated response: {capital_info.model_dump_json()}")
```
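Alternatively, the constraint can be written out directly as a JSON Schema document, with no Pydantic model involved: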
```python
import json

json_schema = json.dumps(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string", "pattern": "^[\\w]+$"},
            "population": {"type": "integer"},
        },
        "required": ["name", "population"],
    }
)

response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "Give me the information of the capital of France in the JSON format.",
        },
    ],
    temperature=0,
    max_tokens=128,
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "foo", "schema": json.loads(json_schema)},
    },
)
print_highlight(response.choices[0].message.content)
```