Introduction to Data Types

Introduction to PowerQuery Data Transformations
11 minutes
Share the link to this page
Copied
  Completed
You need to have access to the item to view this lesson.
One-time Fee
$69.99
List Price:  $99.99
You save:  $30
€65.22
List Price:  €93.18
You save:  €27.95
£56.04
List Price:  £80.07
You save:  £24.02
CA$96.09
List Price:  CA$137.28
You save:  CA$41.18
A$106.44
List Price:  A$152.07
You save:  A$45.62
S$94.93
List Price:  S$135.63
You save:  S$40.69
HK$547
List Price:  HK$781.46
You save:  HK$234.46
CHF 63.65
List Price:  CHF 90.94
You save:  CHF 27.28
NOK kr765.05
List Price:  NOK kr1,092.98
You save:  NOK kr327.92
DKK kr486.48
List Price:  DKK kr695
You save:  DKK kr208.52
NZ$116.68
List Price:  NZ$166.69
You save:  NZ$50.01
د.إ257.06
List Price:  د.إ367.25
You save:  د.إ110.18
৳7,685.71
List Price:  ৳10,980.05
You save:  ৳3,294.34
₹5,844.71
List Price:  ₹8,349.95
You save:  ₹2,505.23
RM331.80
List Price:  RM474.03
You save:  RM142.22
₦98,423.43
List Price:  ₦140,610.93
You save:  ₦42,187.50
₨19,474.95
List Price:  ₨27,822.55
You save:  ₨8,347.60
฿2,584.73
List Price:  ฿3,692.63
You save:  ฿1,107.90
₺2,255.06
List Price:  ₺3,221.65
You save:  ₺966.59
B$356.31
List Price:  B$509.04
You save:  B$152.73
R1,298.99
List Price:  R1,855.78
You save:  R556.79
Лв127.57
List Price:  Лв182.26
You save:  Лв54.68
₩95,950.21
List Price:  ₩137,077.60
You save:  ₩41,127.39
₪261.40
List Price:  ₪373.45
You save:  ₪112.04
₱4,015.50
List Price:  ₱5,736.67
You save:  ₱1,721.17
¥10,912.42
List Price:  ¥15,589.84
You save:  ¥4,677.42
MX$1,187.01
List Price:  MX$1,695.81
You save:  MX$508.79
QR254.79
List Price:  QR364.01
You save:  QR109.21
P956.45
List Price:  P1,366.42
You save:  P409.96
KSh9,168.69
List Price:  KSh13,098.69
You save:  KSh3,930
E£3,313.80
List Price:  E£4,734.21
You save:  E£1,420.40
ብር4,020.93
List Price:  ብር5,744.43
You save:  ብር1,723.50
Kz58,536.36
List Price:  Kz83,626.96
You save:  Kz25,090.59
CLP$65,519.73
List Price:  CLP$93,603.63
You save:  CLP$28,083.90
CN¥505.79
List Price:  CN¥722.59
You save:  CN¥216.80
RD$4,064.05
List Price:  RD$5,806.04
You save:  RD$1,741.98
DA9,434.16
List Price:  DA13,477.95
You save:  DA4,043.79
FJ$159.43
List Price:  FJ$227.77
You save:  FJ$68.34
Q544.12
List Price:  Q777.35
You save:  Q233.22
GY$14,659.33
List Price:  GY$20,942.80
You save:  GY$6,283.47
ISK kr9,804.19
List Price:  ISK kr14,006.59
You save:  ISK kr4,202.40
DH702
List Price:  DH1,002.91
You save:  DH300.90
L1,236.68
List Price:  L1,766.77
You save:  L530.08
ден4,019.08
List Price:  ден5,741.79
You save:  ден1,722.70
MOP$563.89
List Price:  MOP$805.60
You save:  MOP$241.70
N$1,302.92
List Price:  N$1,861.40
You save:  N$558.47
C$2,577.98
List Price:  C$3,682.99
You save:  C$1,105
रु9,357.62
List Price:  रु13,368.60
You save:  रु4,010.98
S/260.83
List Price:  S/372.63
You save:  S/111.80
K271.34
List Price:  K387.65
You save:  K116.30
SAR262.49
List Price:  SAR375
You save:  SAR112.51
ZK1,913.50
List Price:  ZK2,733.69
You save:  ZK820.18
L324.50
List Price:  L463.60
You save:  L139.09
Kč1,630.34
List Price:  Kč2,329.15
You save:  Kč698.81
Ft25,331.89
List Price:  Ft36,189.97
You save:  Ft10,858.07
SEK kr764.93
List Price:  SEK kr1,092.80
You save:  SEK kr327.87
ARS$61,714.55
List Price:  ARS$88,167.42
You save:  ARS$26,452.87
Bs483.86
List Price:  Bs691.26
You save:  Bs207.40
COP$272,553.83
List Price:  COP$389,379.31
You save:  COP$116,825.47
₡35,845.57
List Price:  ₡51,210.15
You save:  ₡15,364.58
L1,730.57
List Price:  L2,472.36
You save:  L741.78
₲523,213.21
List Price:  ₲747,479.48
You save:  ₲224,266.27
$U2,704.33
List Price:  $U3,863.50
You save:  $U1,159.16
zł280.23
List Price:  zł400.34
You save:  zł120.11
Already have an account? Log In

Transcript

So far in this course, I structured the information in such a way that data types have been picked up pretty much automatically, and we've been able to just run with the information, but that's not always going to be the case. The getting transformed tool treats information differently based upon what type of data is detected within a field. Consider the columns I have here, I have a numbers field a text field, a date field, a date says numbers field, and a numbers this text field numbers have different properties from text values. txt has its own properties and dates have their own properties. numbers can be operated on through addition, subtraction, and a whole host of other commands. While they're ordered based upon their magnitude, and they can be filtered easily by ranges.

Text values work a little bit differently. Text consists of a series of letters, numbers and symbols, which are ordered based upon the alphabetical ordering rules. filters for text are not simply a greater than or less than range, and can consist of regular expressions or other more complex commands. Further, text fields cannot be added or multiplied together, but they do have other commands unique to them. They can be split or combined, searched for and replaced through text can also have a variety of formatting adjustments made such as capitalization, prefixes, suffixes, and more. But that leaves us with dates.

Dates, to be blunt, are simply specialized forms of numbers that have been assembled into a date and time component. Dates themselves are pretty special, and they have a variety of their own constructs. They're defined as a combination of days, months, years, hours, and minutes. And they can also reflect durations. As a result date fields can be treated like numbers, but also sort of like text. And they come with a variety of their own specialized functions.

So now What's the difference between these data types? Well, let's start with dates. Dates can be sorted, oldest to newest or newest oldest, which is sort of like from smallest to largest or largest to smallest. But really, like I said, dates are special numbers. Consider our January 23 20 1712 o'clock AM, you know, midnight, right? Well, there's a number that simply the value 42,758.

That same day at 8am is 42,758.3333. And this way, you could perform some simple calculations on dates as if they were numbers, they still have to be treated differently for their variety of special constructs. What if you wanted to add one hour, or one minute or one year? One year isn't always 365? Just like one month isn't always 30. Thankfully, each of these special cases have different ways of being handled.

But what if we were to change our numbers to text values? Unlike dates, numbers shown us text are still just numbers, but they do behave differently. Consider the numbers here. If I sort the numbers I have from smallest to largest, I get to 1821 21.2, just as you'd expect, they're ordered by magnitude. But what if I order my text numbers alphabetically in alphabetical order, remember the order is defined by the first letter, since one is less than two 150 and 18, come before the value of two. Further since five is less than eight, 150 comes before 18, and so forth.

Simply converting our number into a text field significantly changes how it gets interpreted, rather than the overall magnitude driving the order. The values get ordered based upon their alphabetical properties. I can fix the magnitude order issue by changing my format to add leading zeros doing this each value gets interpreted as the same value along the columns of text. So zero can Compared to 00 gets compared to one, two gets compared to eight, all at the same stage of the number. But this looks really funny. So number and text values clearly need to be differentiated.

Here in Excel, the format is the only real difference between dates and numbers, while the difference between text to numbers is actually engrained in the value itself. In this case, I'm using the text function to convert a number into text. But you can also convert a number into text with an apostrophe. Conveniently, all of my data here is arranged in a table. So I'm going to turn it into a query so that we can play with some of the data type definitions within our get and transform tool. Before we make any of our own changes.

There are two key things to look at when a query is first created. One, there are actually two steps created when most queries are first designed, the sources defined and then the query editor automatically evaluates each field for the most applicable data type and creates the change type step. Second, each field name has an icon next to it to describe what type of data the query thinks is contained in that field. For our numbers field, we can see a 1.2. This indicates that the field is a decimal number. The text field has ABC, which indicates a text data type.

The date field has a calendar with a clock, which indicates a date time combination field. The dates has numbers and numbers as text fields both have a 1.2, indicating that they've been detected as decimal types. Each data type has its own unique icon to make identifying what getting transformed thinks the data is very quick and very easy. Now I'm going to delete my change type command so that we can see how it impacted our information upon the start of this query. Looking at our icons, now, they show ABC and 123 which basically means that they could be anything at this point. The query editor simply has the information, but has not assigned a data type to it.

So it can be any data Type, the change type command that we deleted is a very nice and convenient step, but we can easily recreate it ourselves. And if we hadn't deleted the command, we could always modify it through commands on both the home and transform tabs of the ribbon. On each there are different buttons for defining the data type for a field. The first and easiest to use is on the transform tab. Once here, I can select any column, and I'll select my numbers field, and then I can click the detect data type button. The detect data type will ask the query editor what type of data it thinks is in a column.

In this case, it has taken my numbers field and identified it as a decimal number type. I could do this for all the fields at once by selecting them all and then clicking the detect type and effectively recreating my original change type command. But what if I don't like the data type assigned, for example, my numbers is text. You can manually define data types by clicking on any one of these data types and clicking on the Data Type drop down on the transformer in the Directly above the detect data type button. Alternatively, you can do the same thing with the Data Type drop down on the Home tab. The data types recognized by the editor fall into three general categories numbers, dates, and text, decimal, number, currency whole number are all different number types with various restrictions.

The true false type can be applied to number fields that consist of zeros or ones that represent simple on or off values. Otherwise, I could select any of these number types for my number field and see different results whole numbers strips off any decimals. Currency isn't much different from decimal except that it rounds to four decimal places and of course decimal numbers are standard full number field below the number types are the date types. Date types come in a few different flavors that somewhat mimic the difference between whole numbers, decimals and currency numbers. The date time is the most common in associates a date value and a time value together like our January 23rd 2017 am example we have a date and a time. The next most common is probably the date only value.

The date only value is sort of like the whole number approach, it converts our date time into strictly a date by dropping the time component. converting to a time component is the exact opposite. Doing so drops the date, but keeps the time finally the date time. timezone is a more advanced version of the date time it takes the date and the time and then it adds the time zone for the user into part of the date unless the timezone otherwise specified when working with clock entries from around the globe. This provides some very useful flexibility, but it's something that's quite far beyond the scope of this course. Finally, duration is a date time entry that functions differently from all of the others and that it represents a period of time like a stopwatch as opposed to a set point in time.

Instead of converting these dates to a duration. We'll convert our numbers To iteration to a duration takes a number and converts it into a series of days, hours, minutes and seconds. In this case, the two converts into two days zero hours zero minutes are 21.2 converts into 21 days, four hours and 48 minutes. durations are simply a way of converting a number into a date so that it can be added or removed from another date, which is something that we'll talk about more when we cover dates in detail. The final data transformation we'll cover is the text transformation. The text type is the most generic of the types, and it can be applied to pretty much any field, a text field does not try to interpret the meaning of the values in any way that simply lists the characters in order as represented to the query by the dataset.

If I convert my dates to a text, I still get the date shown, but it's simply represented in a textual way where the computer interprets it as six backslash one four so on rather than as a date value. of June 14, so it gets read and interpreted differently. Similarly, numbers get treated the same way. Remember our example and how they got ordered. It's because instead of being the number 18, it ceases to be 18 and becomes one and then eight. For this course, you won't really need to understand the detailed differences in that comparison.

However, just remember that it's very important that data types match categories. when interacting numbers should interact with numbers, text should interact with text, and date should interact with dates. The final data type in our drop down is binary, and we're not even going to go there, ignore it, pretend it doesn't exist. If you make it to the advanced advanced portions of getting transformed into major database work perhaps then you'll use it until then just ignore it. Data Types simply tell the query editor how to work with values in a given data field. Understanding what rules apply makes a very big difference in what can be done with a field and assists the query editor in guiding your own hand to the appropriate size.

While avoiding errors along the way, the next three chapters will focus on each of the major data type groups. Specifically, we'll start with text fields, move on to number fields, and conclude with date fields. In those sections, we'll outline the transformation functions and some of the other commands available to us and discuss some common pitfalls.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.