University of Washington Open Course: Introduction to Data Science 037_pig_evalua
Example
A = LOAD 'traffic.dat' AS (ip, time, url);
B = GROUP A BY ip;
C = FOREACH B GENERATE group AS ip, COUNT(A);
D = FILTER C BY ip == '192.168.0.1' OR ip == '192.168.0.0';
STORE D INTO 'local_traffic.dat';
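To make the script's result concrete, here is a minimal Python sketch of what it computes. The sample rows are hypothetical stand-ins for traffic.dat, and the function only mirrors the statements one by one; it is not how Pig executes them.

```python
from collections import defaultdict

# Hypothetical stand-in for traffic.dat: (ip, time, url) tuples.
traffic = [
    ("192.168.0.1", 1, "/a"),
    ("10.0.0.7", 2, "/b"),
    ("192.168.0.1", 3, "/c"),
    ("192.168.0.0", 4, "/a"),
]

def run_script(rows):
    # B = GROUP A BY ip: one bag of rows per distinct ip
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    # C = FOREACH B GENERATE group AS ip, COUNT(A)
    counts = [(ip, len(bag)) for ip, bag in groups.items()]
    # D = FILTER C BY ip being one of the two local addresses
    local = {"192.168.0.1", "192.168.0.0"}
    # Sorted only so the output order is deterministic
    return sorted((ip, n) for ip, n in counts if ip in local)

result = run_script(traffic)
print(result)
```

Note that the script as written groups and counts every ip, including 10.0.0.7, before discarding the ones it does not need.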
5/18/2013
Bill Howe, UW. Adapted from slides by Oliver Kennedy, U of Buffalo
Logical plan, one operator per statement:
LOAD -> GROUP -> FOREACH -> FILTER -> STORE
Algebraic Optimization!
The FILTER tests only ip, the grouping key, so it can run before the GROUP:
LOAD -> FILTER -> GROUP -> FOREACH -> STORE
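A small Python sketch of why this reordering is safe: filtering on the grouping key before or after the GROUP gives the same counts, but the early filter sends fewer rows into the grouping step. The rows and helper names are hypothetical, chosen only for illustration.

```python
from collections import Counter

LOCAL = {"192.168.0.1", "192.168.0.0"}

# Hypothetical (ip, time, url) rows.
rows = [
    ("192.168.0.1", 1, "/a"),
    ("10.0.0.7", 2, "/b"),
    ("192.168.0.1", 3, "/c"),
    ("192.168.0.0", 4, "/a"),
    ("10.0.0.9", 5, "/d"),
]

def filter_after_group(rows):
    # Original plan: GROUP and COUNT everything, then FILTER the groups.
    counts = Counter(ip for ip, _time, _url in rows)
    return {ip: n for ip, n in counts.items() if ip in LOCAL}

def filter_before_group(rows):
    # Optimized plan: FILTER rows first, then GROUP and COUNT the survivors.
    kept = [r for r in rows if r[0] in LOCAL]
    return dict(Counter(ip for ip, _time, _url in kept))

same_answer = filter_after_group(rows) == filter_before_group(rows)
rows_grouped_late_filter = len(rows)
rows_grouped_early_filter = len([r for r in rows if r[0] in LOCAL])
print(same_answer, rows_grouped_late_filter, rows_grouped_early_filter)
```

The rewrite works here because the predicate depends only on the grouping key; a filter on the count itself could not be moved ahead of the GROUP.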
Lazy Evaluation: No work is done until STORE
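One way to picture lazy evaluation is a relation object that only records operators until a store call runs them. The sketch below is a toy under that assumption; LazyRelation and its methods are hypothetical names, not Pig's API.

```python
class LazyRelation:
    """Toy relation that records operators instead of running them."""

    def __init__(self, rows, plan=()):
        self._rows = rows
        self.plan = plan  # tuple of (operator name, function over a row list)

    def filter(self, predicate):
        step = ("FILTER", lambda rows: [r for r in rows if predicate(r)])
        return LazyRelation(self._rows, self.plan + (step,))

    def foreach(self, fn):
        step = ("FOREACH", lambda rows: [fn(r) for r in rows])
        return LazyRelation(self._rows, self.plan + (step,))

    def store(self, log):
        # Only here does any operator actually run.
        rows = list(self._rows)
        for name, run in self.plan:
            log.append(name)
            rows = run(rows)
        log.append("STORE")
        return rows

log = []
a = LazyRelation([("192.168.0.1", "/a"), ("10.0.0.7", "/b")])
d = a.filter(lambda r: r[0] == "192.168.0.1").foreach(lambda r: r[0])
ran_before_store = list(log)  # nothing has executed yet
out = d.store(log)
print(ran_before_store, out, log)
```

Because the whole plan is known before anything runs, a system built this way has the chance to reorder operators first, as in the algebraic optimization step.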
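Example
1) Create a MR job for each COGROUP (GROUP is the single-input case)
2) Add other commands where possible
Certain commands require their own MR job (e.g., ORDER)
Plan for the example: LOAD -> FILTER -> GROUP -> FOREACH -> STORE, compiled into a single Map and Reduce job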
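The two rules can be sketched as a single simulated MapReduce job in Python. This assumes the operators before the GROUP are placed in the map function and those after it in the reduce function; map_fn, reduce_fn, and run_job are hypothetical names, and the in-memory shuffle stands in for Hadoop's.

```python
from collections import defaultdict

LOCAL = {"192.168.0.1", "192.168.0.0"}

def map_fn(record):
    # LOAD + FILTER run in the map phase; survivors are keyed by ip.
    ip, _time, _url = record
    if ip in LOCAL:
        yield ip, record

def reduce_fn(ip, bag):
    # FOREACH ... COUNT runs once per group in the reduce phase.
    return ip, len(bag)

def run_job(records):
    shuffled = defaultdict(list)  # the shuffle between phases implements GROUP
    for record in records:
        for key, value in map_fn(record):
            shuffled[key].append(value)
    # STORE: collect reducer output, here in key order
    return [reduce_fn(key, shuffled[key]) for key in sorted(shuffled)]

sample = [
    ("192.168.0.1", 1, "/a"),
    ("10.0.0.7", 2, "/b"),
    ("192.168.0.1", 3, "/c"),
    ("192.168.0.0", 4, "/a"),
]
job_output = run_job(sample)
print(job_output)
```

A script with a second GROUP would need a second call to something like run_job, fed by the first job's output.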
Review
NoSQL
– "NoSchema", "NoTransactions", "NoLanguage"
– A "reboot" of data systems focusing on just high-throughput reads and writes
– But: a clear trend towards re-introducing schemas, languages, transactions at full scale (Google's Spanner system, for example)
Pig
– An RA-like language layer on Hadoop
– But not a pure relational data model
– "Schema-on-Read" rather than "Schema-on-Write"
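A rough Python illustration of schema-on-read: the raw file carries no types, and each reader supplies a schema when loading. The load helper, the sample data, and the choice to turn a failed cast into None are assumptions of this sketch rather than a description of Pig's loader.

```python
# Hypothetical raw, tab-separated data with no schema attached.
RAW = "192.168.0.1\t1368860000\t/index.html\n10.0.0.7\tnot-a-number\t/about\n"

def load(raw, schema):
    """Apply (name, cast) pairs to each line at read time."""
    for line in raw.splitlines():
        fields = line.split("\t")
        row = {}
        for (name, cast), text in zip(schema, fields):
            try:
                row[name] = cast(text)
            except ValueError:
                row[name] = None  # assumption: a bad value becomes a null
        yield row

# The same bytes read under two different schemas.
as_strings = list(load(RAW, [("ip", str), ("time", str), ("url", str)]))
as_typed = list(load(RAW, [("ip", str), ("time", int), ("url", str)]))
print(as_strings[1]["time"], as_typed[0]["time"], as_typed[1]["time"])
```

Under schema-on-write, the second line would have been rejected or fixed when the data was stored; here the problem only surfaces for the reader that asks for an integer.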
5/18/2013
Bill Howe, UW