Tag

Explore by tags

All

30u30

ASR

ChatGPT

GNN

IDE

RAG

agent-skills

ai

ai-agent

ai-api

ai-api-management

ai-client

ai-coding

ai-demos

ai-deploy

ai-development

ai-framework

ai-image

ai-image-demos

ai-inference

ai-leaderboard

ai-library

ai-rank

ai-serving

ai-tools

ai-train

ai-video

ai-workflow

AIGC

algorithms

alibaba

amazon

android

anthropic

audio

aws

biology

blog

book

bytedance

chatbot

chatgpt

chemistry

claude

claude-code

cli

code

codex

copilot

course

cursor

deepmind

deepseek

depth

devops

diffusers

docker

drug-discovery

electron

embeddings

engineering

evaluation

facebook

finance

foundation

foundation-model

gemini

gemini-cli

gemma

genomics

gitHub

github

go

google

gradient-booting

grok

groq

huggingface

image

ios

java

javascript

json

LLM

llm

mLOps

math

mcp

mcp-client

mcp-server

meta-ai

meta-pytorch

microsoft

mlops

mobile

multilingual

multimodal

mysql

NLP

nlp

nodejs

numpy

nvidia

ocr

ollama

openai

opencode

pandas

paper

physics

pi

plugin

polars

postgres

privacy

prompt-engineering

pwa

python

pytorch

qwen

react

RL

robotics

rust

science

security

shodan

skillkit

sora

speech

sqlite

ssh

stt

swe

tensorrt

terminal

transformers

translation

tts

tutorial

typescript

vibe-coding

video

vision

vllm

voice

xAI

xai

EdgeBench

2026

ByteDance Seed

Measures how autonomous AI agents learn via long-horizon, feedback-rich executable tasks; publishes 51 public tasks from a 134-task suite and provides SForge, a two-container evaluation harness for iterative 12+ hour runs to track learning trajectories.

evaluation ai-agent bytedance llm huggingface +2

Morphing into Hybrid Attention Models

2026

Fudan University, ByteDance Seed +1

Disen Lan, Jianbin Zheng +6

Treats hybrid layer selection as a budget-constrained subset optimization and introduces FlashMorph, a pipeline that equips each transformer layer with a linear-attention branch, jointly optimizes layerwise gates on synthetic long-context retrieval data, then discretizes, distills, and fine-tunes the result. It achieves strong long-context recall using only 20M selection tokens.

paper transformers llm nlp qwen +2
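
The "discretize" step in the FlashMorph summary above can be pictured as a budgeted top-k choice over the learned layer gates. The sketch below is only an illustration under stated assumptions, not the paper's procedure: the function names, the convention that a higher gate value means the layer should keep full attention, and the top-k rule itself are all hypothetical.

```python
def select_full_attention_layers(gates, budget):
    """Pick the `budget` layers with the highest gate values.

    Assumption (not from the source): a higher gate means the layer
    benefits more from keeping full softmax attention.
    """
    if not 0 <= budget <= len(gates):
        raise ValueError("budget must be between 0 and the number of layers")
    # Rank layer indices by gate value, highest first; ties keep layer order.
    ranked = sorted(range(len(gates)), key=lambda i: gates[i], reverse=True)
    return sorted(ranked[:budget])


def discretize(gates, budget):
    """Turn continuous gates into a per-layer 'full' / 'linear' assignment."""
    keep = set(select_full_attention_layers(gates, budget))
    return ["full" if i in keep else "linear" for i in range(len(gates))]


if __name__ == "__main__":
    # Hypothetical gate values for a four-layer model, budget of two layers.
    print(discretize([0.9, 0.1, 0.7, 0.3], budget=2))
```

With these made-up gates, layers 0 and 2 keep full attention and the other two switch to the linear-attention branch. A real pipeline would follow this with the distillation and fine-tuning stages the summary mentions.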