Salesforce Large Data Volumes (LDV)问题案例分析

案例背景】:
前些天接到任务,需要为重复Lead打标记(原始需求是去重),Leads主要来自Web-to-Lead和系统创建,标记重复的条件是Phone和线索所在区域(Global Area)。

项目背景】:
该项目属于金融行业维护型项目,在实施变更项前现存Lead记录有36w+,主要测试项数据来自Web-to-Lead。

实施前考虑】:
1. 数据体量大是考虑的首要条件;
2. Web-to-Lead有个特性,使用Unique自定义字段去重,重复项将不会在sf入库,造成数据丢失;另外任何异常不会显示在website;

问题分析】:
1. 当创建Lead时,首先需要从Trigger.New将用于去重的文本取出来,然后在系统里面去查满足条件的数据,这时候肯定不能使用Before Insert的场景,需要转换成After Insert,原因是在for里面做查询或者在for外层查所有数据封装成Map在for里面取都会触发SOQL限制,因为有36w+数据。
2. 当确定了After Insert场景后,仍然避不开需要在Trigger里做查询,这时在LDV下,如果未优化查询便会出现如下错误:
System.QueryException: Non-selective query against large object type (more than 200000 rows).

EXCEPTION_THROWN|[69]|System.QueryException: Non-selective query against large object type (more than 200000 rows). Consider an indexed filter or contact salesforce.com about custom indexing.
Even if a field is indexed a filter might still not be selective when:
1. The filter value includes null (for instance binding with a list that contains null)
2. Data skew exists whereby the number of matching rows is very large (for instance, filtering for a particular foreign key value that occurs many times)

造成上述异常的SOQL示例如下:

SELECT Id, Phone, Owner_Entity__c FROM Lead WHERE Phone IN ('123') AND Owner_Entity__c IN ('Temple FX') AND Id NOT IN ('00Q0I000010f1Zu')

解决方案】:Improve performance with custom indexes using selective SOQL queries
有类似处理LDV问题经验的人,不难发现,上述文本中就Id字段为标准索引字段,而且用了Not IN,导致优化后查询的数据体量仍然很大,并没有实质性解决问题。要解决这个,有三种方案:
1. 为标准字段Phone加索引 - 这时候需要联系Support解决,需要向Salesforce提Case
2. 为Owner_Entity__c自定义Picklist字段加索引 - 这种类型无法自助加索引,无法勾选External ID;
3. 创建一个新自定义Text类型索引字段,每次创建记录都会使用程序将Phone和Owner_Entity__c拼接起来的文本填充给该字段,然后Trigger里面搜索使用该字段。
对第三点评估
优点:这种方案比较常用,特别适合项目上线初期计划,最明显的优势是可以自助索引;
缺点:对于维护项目就多了跑Batch刷历史数据的工作,像这种36w的数据,还得刷老半天。

当然你也许会想到使用公式字段来管理重复条件取代第三点,而且不需要刷数据,更具有扩展性取代条件的Hard Code,之后随时变更,但是还是有限制的,不仅仅不能自助加索引,而且还限于deterministic formula这种才行

优化后代码】:其中Phone为加索引的字段,标准Email字段默认是加索引的

List<Lead> originals = new List<Lead>();
originals = [SELECT Id, Phone, Owner_Entity__c FROM Lead WHERE Phone IN : phoneSet AND Owner_Entity__c IN :ownerEntitySet];

if(originals.size() > 0)
    for(Lead l : originals)
        if(!newLeadIds.contains(l.Id))
            dupCriteriaSet.add(l.Owner_Entity__c + '>_<' + l.Phone);

【去重效果展示】:

知识盘点】:Salesforce: Index Fields
1. sf有两种类型的索引:标准字段和自定义字段;
2. 标准字段默认索引如下:

  • RecordTypeId
  • Division
  • CreatedDate
  • Systemmodstamp (LastModifiedDate)
  • Name
  • Email (for contacts and leads)
  • Lookups and master-detail relationship
  • Salesforce record ID

其他版本:STANDARD INDEXED FIELDS

The standard Salesforce indexed fields are:

  •  Lead: Company, Email, Lead Owner, Name.
  • Contact: Account Name, Contact Owner, Email, Name, Reports To.
  •  Account: Account Name, Account Owner, Account Record Type, Parent Account,
  •  On top of these, system time fields (last modified, created date etc) and any lookup fields are also indexed.

3. 管理员可以通过为下列数据类型定义为ExternalID或者Unique Field来加索引:

  • Number
  • Auto-number
  • Text
  • Email

4. 限制:每个对象你最多可以为3个字段增加索引,幸运地是你也可以通过提Case让Salesforce Support为确定性字段创建自定义索引。

其他版本:
Certain fields cannot be used as indexes. Examples include: multi-option picklists, currency fields if the org has multi-currency enabled, certain formula fields, binary fields (like a blob or file). In general any complex field type is not a candidate.

5. 为公式字段加索引的条件:Improve performance with custom indexes using selective SOQL queries - Query 4

6. Support不能加索引的类型汇总:Force.com SOQL Best Practices: Nulls and Formula Fields

Since the Winter ’13 release, you have been able to contact salesforce.com Customer Support to create a custom index on a formula field, provided that the function that defines the formula field is deterministic.

Here are examples of common non-deterministic formulas. Force.com cannot index fields that:

  • Reference other entities (i.e., fields accessible through lookup fields)
  • Include other formula fields that span over other entities
  • Use dynamic date and time functions (e.g., TODAYNOW)

A formula is also considered non-deterministic when it includes:

  • Owner, autonumber, divisions, or audit fields (except for CreatedDate and CreatedByID fields)
  • References to fields that Force.com cannot index
    • Multi-select picklists
    • Currency fields in a multicurrency organization
    • Long text area fields
    • Binary fields (blob, file, or encrypted text)
  • Standard fields with special functionalities
    • Opportunity: Amount, TotalOpportunityQuantity, ExpectedRevenue, IsClosed, IsWon
    • Case: ClosedDate, IsClosed
    • Product: Product Family, IsActive, IsArchived
    • Solution: Status
    • Lead: Status
    • Activity: Subject, TaskStatus, TaskPriority

Because values might vary over time or change when a transaction updates a related entity, Force.com cannot index non-deterministic formulas.

7. 最后你也可以参考Query & Search Optimization Cheat Sheet去了解更多的细节,如下图:

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值