DQ22DMP1 001

Uploaded by Alex Halawi
DataFlux Data Management Studio: Basics
Course Notes
DataFlux Data Management Studio: Basics Course Notes was developed by Kari Richardson. Editing
and production support was provided by the Curriculum Development and Support Department.
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of
SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product
names are trademarks of their respective companies.
DataFlux Data Management Studio: Basics Course Notes
Copyright © 2012 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of
America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in
any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written
permission of the publisher, SAS Institute Inc.
Book code E2229, course code DQ22DMP1, prepared date 09May2012. DQ22DMP1_001
ISBN 978-1-61290-309-5
For Your Information iii
Table of Contents
Course Description ...................................................................................................................... vi
Prerequisites ............................................................................................................................... vii
Chapter 1 Introduction to DataFlux Methodology and Course Flow ................. 1-1
1.1 Introduction ...................................................................................................................... 1-3
1.2 Course Flow ..................................................................................................................... 1-6
Chapter 2 DataFlux Data Management Studio: Getting Started ......................... 2-1
2.1 Introduction ...................................................................................................................... 2-3
Demonstration: Navigating the DataFlux Data Management Studio Interface ......... 2-6
Demonstration: Creating a DataFlux Data Management Studio Repository .......... 2-20
Exercises .................................................................................................................. 2-26
2.2 Quality Knowledge Base and Reference Sources .......................................................... 2-27
Demonstration: Verifying the Course QKB and Reference Sources ....................... 2-31
2.3 Data Connections ........................................................................................................... 2-33
Demonstration: Working with Data Connections ................................................... 2-36
Exercises .................................................................................................................. 2-53
2.4 Solutions to Exercises .................................................................................................... 2-54
Chapter 3 PLAN ...................................................................................................... 3-1
3.1 Creating Data Collections ................................................................................................ 3-3
Demonstration: Creating Collections of Address and Note Fields ........................... 3-5
Exercises .................................................................................................................. 3-13
Demonstration: Creating and Exploring a Data Exploration ................................. 3-18
3.3 Creating Data Profiles .................................................................................................... 3-32
Demonstration: Creating and Exploring a Data Profile .......................................... 3-39
Exercises .................................................................................................................. 3-53
Demonstration: Profiling Data Using Text File Input ............................................ 3-56
Demonstration: Profiling Data Using Filtered Data and an SQL Query ................ 3-61
Demonstration: Profiling Data Using a Collection ................................................ 3-68
Exercises .................................................................................................................. 3-70
Demonstration: Data Profiling – Additional Analysis (Self-Study) ........................ 3-73
3.4 Designing Data Standardization Schemes ..................................................................... 3-83
Demonstration: Creating a Phrase Standardization Scheme .................................. 3-85
Demonstration: Creating an Element Standardization Scheme .............................. 3-92
Demonstration: Importing a Scheme from a Text File ........................................... 3-96
Exercises .................................................................................................................. 3-98
3.5 Solutions to Exercises .................................................................................................... 3-99
Chapter 4 ACT ........................................................................................................ 4-1
4.1 Introduction to Data Jobs ................................................................................................. 4-3
Demonstration: Setting DataFlux Data Management Studio Options ..................... 4-5
Demonstration: Creating and Running a Simple Data Job ..................................... 4-10
Exercises .................................................................................................................. 4-33
4.2 Data Quality Jobs ........................................................................................................... 4-34
Demonstration: Investigating Standardizing, Parsing, and Casing ......................... 4-38
Demonstration: Investigating Right Fielding and Identification Analysis ............. 4-50
Exercises .................................................................................................................. 4-61
Demonstration: Using a Standardization Definition and a Standardization Scheme .......... 4-63
4.3 Data Enrichment Jobs (Self-Study) ................................................................................ 4-75
Demonstration: Performing Address Verification and Geocoding .......................... 4-77
Exercises .................................................................................................................. 4-88
4.4 Entity Resolution Jobs ................................................................................................... 4-90
Demonstration: Creating a Data Job to Cluster Records ......................................... 4-93
Exercises ................................................................................................................ 4-108
Demonstration: Creating an Entity Resolution Job ............................................... 4-112
Demonstration: Creating a Data Job to Compare Clusters (Optional) .................. 4-138
4.5 Multi-Input/Multi-Output Data Jobs (Self-Study) ....................................................... 4-147
Demonstration: Multi-Input/Multi-Output Data Job: New Products .................... 4-149
Demonstration: Multi-Input/Multi-Output Data Job: Customer Matches ............. 4-159
Exercises ................................................................................................................ 4-166
Chapter 5 MONITOR ............................................................................................... 5-1
5.1 Business Rules ................................................................................................................. 5-3
Demonstration: Creating a Row-Based Business Rule ............................................. 5-6
Exercises .................................................................................................................. 5-11
Demonstration: Adding a Business Rule and an Alert to a Profile ......................... 5-15
Demonstration: Performing a Historical Visualization ........................................... 5-21
Exercises .................................................................................................................. 5-25
Demonstration: Creating a Monitoring Job for a Row-Based Rule ........................ 5-29
Demonstration: Creating Another Data Job Calling the Same Task ....................... 5-38
Exercises .................................................................................................................. 5-41
Demonstration: Creating a Monitoring Job for a Set-Based Rule .......................... 5-42
Exercises .................................................................................................................. 5-48
Demonstration: Viewing the Monitor Dashboard ................................................... 5-50