Social Networks: Getting Distributed Web Services Done with NoSQL

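This article discusses the challenges social networks face and how to address them, particularly data processing at Twitter scale. It describes using NoSQL storage technologies such as Redis, Voldemort, and Hazelcast to handle large-scale activity stream processing.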


http://www.infoq.com/presentations/Social-Networks-NoSQL

 

monolithic single service, synchronous

Asynchronous Services

php-amqp (http://code.google.com/p/php-amqp/)

Activity Stream

Social Network Problem (Twitter Problem)

• >15 different Events
• Timelines
• Aggregation
• Filters
• Privacy

 

18M Events/day sent to ~150 friends
=> 2700M timeline inserts / day
20% during peak hour
=> 3.6M event inserts/hour = 1000/s
=> 540M timeline inserts/hour = 150000/s
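
The figures above follow from simple arithmetic. A small sketch that reproduces them, using only the numbers in the notes (the function and key names are illustrative, not from the talk):

```python
SECONDS_PER_HOUR = 3600


def peak_load(events_per_day, avg_recipients, peak_share_percent):
    """Derive peak-hour insert rates from daily volume (integer arithmetic)."""
    timeline_inserts_per_day = events_per_day * avg_recipients
    peak_events_per_hour = events_per_day * peak_share_percent // 100
    peak_timeline_per_hour = timeline_inserts_per_day * peak_share_percent // 100
    return {
        "timeline_inserts_per_day": timeline_inserts_per_day,
        "peak_events_per_hour": peak_events_per_hour,
        "peak_events_per_second": peak_events_per_hour // SECONDS_PER_HOUR,
        "peak_timeline_per_hour": peak_timeline_per_hour,
        "peak_timeline_per_second": peak_timeline_per_hour // SECONDS_PER_HOUR,
    }


# 18M events/day, ~150 friends each, 20% of traffic in the peak hour
load = peak_load(18_000_000, 150, 20)
```

The fan-out factor (~150) is what turns a modest 1000 events/s into 150000 timeline inserts/s, which is the motivation for the push/pull split described later.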

 

• Nginx + Janitor
• Embedded Jetty + RESTeasy
• NoSQL Storage Backends

 

 

Schema:

Message → [distribute uid, body]

(index)recipient_time_line → [itemId, time, type]

(index)sender_time_line → [itemId, time, type]

• Identify Users with recipient lists >{limit}
• Only push updates with recipients <{limit} to MRI(recipient_time_line)
• Pull special profiles and users with >{limit} from ORI (sender_time_line)
• Identify active users with a bloom/bit filter for pull
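
The push/pull rules above can be sketched in memory as follows. This is only an illustration under assumptions: the class and method names are made up, the threshold value is arbitrary (the notes only say {limit}), plain dicts stand in for the NoSQL backends, and the bloom/bit filter for active users is left out.

```python
from collections import defaultdict


class ActivityStreams:
    """Hybrid fan-out: push for small recipient lists, pull for large ones."""

    def __init__(self, limit):
        self.limit = limit                            # the {limit} from the notes
        self.followers = defaultdict(set)             # sender -> recipients
        self.following = defaultdict(set)             # recipient -> senders
        self.messages = {}                            # itemId -> (sender uid, body)
        self.recipient_time_line = defaultdict(list)  # uid -> [(itemId, time, type)]
        self.sender_time_line = defaultdict(list)     # uid -> [(itemId, time, type)]
        self._next_id = 0

    def follow(self, recipient, sender):
        self.followers[sender].add(recipient)
        self.following[recipient].add(sender)

    def publish(self, sender, body, time, type_="status"):
        item_id = self._next_id
        self._next_id += 1
        self.messages[item_id] = (sender, body)
        entry = (item_id, time, type_)
        # Always record on the sender's own timeline (used for pulls).
        self.sender_time_line[sender].append(entry)
        # Push only when the recipient list is below the limit.
        if len(self.followers[sender]) < self.limit:
            for recipient in self.followers[sender]:
                self.recipient_time_line[recipient].append(entry)
        return item_id

    def read_timeline(self, uid):
        # Start from pushed entries, then pull from senders at or above the limit.
        merged = {e[0]: e for e in self.recipient_time_line[uid]}
        for sender in self.following[uid]:
            if len(self.followers[sender]) >= self.limit:
                for e in self.sender_time_line[sender]:
                    merged[e[0]] = e  # keyed by itemId, so no duplicates
        return sorted(merged.values(), key=lambda e: e[1], reverse=True)
```

Keying the merge by itemId avoids duplicates when a sender crosses the limit between two publishes. The trade-off is write amplification for ordinary users versus extra read work for followers of very popular senders.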

 

Activity Filter

Scan a full day of updates for 16M users at per-minute granularity for 1000 friends in < 100 ms
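
The notes do not say how the filter was built. One way to meet such a requirement is a per-user bitset with one bit per minute of the day (1440 bits = 180 bytes per user, roughly 2.9 GB for 16M users). A minimal sketch under that assumption, with hypothetical names and Python integers standing in for the bitsets:

```python
MINUTES_PER_DAY = 24 * 60


class ActivityFilter:
    """One bitset per user; bit m is set if the user posted in minute m."""

    def __init__(self):
        self.bits = {}  # uid -> int used as a 1440-bit bitset

    def record(self, uid, minute):
        if not 0 <= minute < MINUTES_PER_DAY:
            raise ValueError("minute out of range")
        self.bits[uid] = self.bits.get(uid, 0) | (1 << minute)

    def active_friends(self, friends, start_minute, end_minute):
        """Friends with at least one update in [start_minute, end_minute)."""
        mask = ((1 << (end_minute - start_minute)) - 1) << start_minute
        return [f for f in friends if self.bits.get(f, 0) & mask]

    def active_minutes(self, friends):
        """Minutes of the day in which any of the given friends posted."""
        combined = 0
        for f in friends:
            combined |= self.bits.get(f, 0)
        return [m for m in range(MINUTES_PER_DAY) if (combined >> m) & 1]
```

Checking 1000 friends is then 1000 lookups plus bitwise AND/OR operations, which is the kind of work that can plausibly fit in a tight latency budget.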

 

NoSQL: Redis

• Fast in memory Data-Structure Server
• Easy protocol
• Asynchronous Persistence
• Master-Slave Replication
• Virtual Memory (>250 GB of RAM needed; swaps less frequently accessed values to disk)
• JRedis - The Java client (No consistent hashing, No rebalancing, Pipelining support)

• Persistence - AOF (less memory hungry) and BGSAVE (for additional backups)
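
Since the notes list JRedis as having no consistent hashing, spreading keys over several Redis instances would have to be done in the application. The notes do not record how the talk handled this. The following is a generic sketch of a client-side consistent-hash ring; the class name, node names, and the choice of MD5 and 100 virtual nodes are all assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key):
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

When a node is removed, only the keys that lived on it move; keys on the remaining nodes keep their placement, which is the property plain modulo sharding lacks.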

 

NoSQL: Voldemort

• Key-Value Store
• Replication
• Versioning
• Eventual Consistency
• Pluggable Routing / Hashing Strategy
• Rebalancing

• Pluggable Storage-Engine
• Reduce the size of the BDB append log
• Balance Client and Server Threadpools

• Choose a big number of partitions
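
One way to see why a large partition count helps: keys map to a fixed partition, and rebalancing moves whole partitions between nodes, so the partition count bounds how finely data can be spread. The sketch below is a simplified illustration of that idea, not Voldemort's actual routing or rebalancing code; all names and the hash choice are assumptions.

```python
import hashlib
from collections import Counter


def partition_for(key, num_partitions):
    """A key's partition depends only on the key and the fixed partition count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def assign_round_robin(num_partitions, nodes):
    """Initial partition -> node assignment."""
    return {p: nodes[p % len(nodes)] for p in range(num_partitions)}


def rebalance(assignment, new_node):
    """Hand whole partitions to new_node until it has its fair share.

    Returns the list of partitions that moved.
    """
    nodes = set(assignment.values()) | {new_node}
    target = len(assignment) // len(nodes)
    counts = Counter(assignment.values())
    moved = []
    for p in sorted(assignment):
        if len(moved) >= target:
            break
        owner = assignment[p]
        if counts[owner] > target:
            assignment[p] = new_node
            counts[owner] -= 1
            moved.append(p)
    return moved
```

With 12 partitions on three nodes, a fourth node can take over three partitions. With only 3 partitions there is nothing to hand over without emptying an existing node, so the new node stays idle.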

 

NoSQL: Hazelcast

• In Memory Data Grid
• Dynamically Scales
• Distributed java.util.{Queue|Set|List|Map} and more
• Dynamic Partitioning with Backups
• Configurable Eviction
• Persistence

• IDs of Stream Entries generated via Hazelcast

• Nodes get ID ranges assigned (node1: 10000-19999, node2: 20000-29999)
• IDs per range locally incremented on the node (thread safe/atomic)
• Distributed locks secure range assignment for nodes
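
The range scheme above can be sketched as follows. This is a single-process illustration: a local threading.Lock stands in for the Hazelcast distributed lock, and the class names are hypothetical. The default range size matches the example ranges in the notes.

```python
import threading


class RangeAllocator:
    """Hands out disjoint ID ranges; the lock stands in for a distributed lock."""

    def __init__(self, start=10_000, size=10_000):
        self._lock = threading.Lock()
        self._next_start = start
        self.size = size

    def next_range(self):
        with self._lock:
            start = self._next_start
            self._next_start += self.size
        return start, start + self.size  # half-open: [start, end)


class NodeIdGenerator:
    """Per-node generator: increments locally, asks for a new range when empty."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._lock = threading.Lock()
        self._next, self._end = allocator.next_range()

    def next_id(self):
        with self._lock:
            if self._next >= self._end:
                self._next, self._end = self._allocator.next_range()
            value = self._next
            self._next += 1
            return value
```

Nodes only contend on the shared lock once per range, so almost all IDs are produced with a cheap local increment. The cost is that IDs are unique but not globally ordered by time.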

 

Start benchmarking and profiling your app early!

 

 

Glossary:

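Nginx : a reverse proxy server. A forward proxy handles out-bound traffic: for example, an ISP's proxy forwards requests from users (such as us broadband subscribers) to the target servers on the internet, and may cache pages to improve performance. A reverse proxy adds a proxy layer on the server side and distributes requests to different hosts, so it can act as a load balancer.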

 

Janitor: Computer Janitor is a tool that lets you clean up a system so it's more like a freshly installed one ?

 

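AMQP : Advanced Message Queuing Protocol, an open specification for messaging middleware that can be compared to JMS. One difference is that AMQP defines both the semantic layer and the protocol layer of the middleware; another is that AMQP is language-neutral, whereas JMS is tied to Java. Because AMQP defines the semantics, it is not limited to working in a predefined way like JMS or other MQs, but is "programmable" messaging middleware.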

 

RabbitMQ : an MQ implemented in Erlang

 

Jetty : a lightweight (?) Java JSP/servlet container, said to handle high concurrency better than Tomcat and JBoss

 

RESTeasy : an implementation of JAX-RS that integrates well with JBoss; a web service framework with both client and server sides

 

Voldemort : the distributed key-value storage system used by LinkedIn, implemented in Java; pluggable serialization, storage engine, and routing / hashing strategy (all three key modules are pluggable)

 

Hazelcast : In Memory Data Grid, Distributed java.util.{Queue|Set|List|Map} (? quite interesting)


Redis : a key-value store written in C that stores data structures directly (such as list, set, ordered set)


 

 

 

 
