Mastering ETL Testing with Chris Thompson and Mike Calabrese

By Test Guild

About This Episode:

Today, we're diving deep into the complexities of ETL testing with our esteemed guests, Chris Thompson and Mike Calabrese from RTTS.

Register now for the webinar on how AI helps with ETL testing: testguild.me/aietl

With over two decades of combined experience in data QA and automated testing, Chris and Mike will guide us through the intricacies of an actual data warehouse project plan.

They uncover the challenges in data mappings, explore different transformation types, and discuss the critical role of a comprehensive mapping document.

Chris and Mike also highlight the importance of rigorous data testing to prevent costly errors and ensure accurate decision-making.

Join us as we explore the critical aspects of ETL testing and data project planning and gain invaluable insights from two industry veterans who have seen it all. This is a must-listen episode for anyone involved in data testing and looking to optimize their data transformation processes.

I also recommend you check out our upcoming webinar with RTTS on AI in ETL Testing. Register now.

Exclusive Sponsor

Discover TestGuild – a vibrant community of over 34,000 of the world's most innovative and dedicated Automation testers. This dynamic collective is at the forefront of the industry, curating and sharing the most effective tools, cutting-edge software, profound knowledge, and unparalleled services specifically for test automation.

We believe in collaboration and value the power of collective knowledge. If you're as passionate about automation testing as we are and have a solution, tool, or service that can enhance the skills of our members or address a critical problem, we want to hear from you.

Take the first step towards transforming your and our community's future. Check out our done-for-you awareness and lead-generation demand packages, and let's explore the awesome possibilities together.

About Chris Thompson


Chris Thompson has a distinguished background that includes a BS in computer science from the University of Delaware. He is also a military veteran who served as an aviation electronics technician in the U.S. Navy. Chris has successfully implemented QA projects in the data space for over 15 years, and he has been working at RTTS for the past 21 years. Chris started with RTTS as an entry-level test engineer and has worked in various fields, including pharmaceutical, utilities, and retail.

Connect with Chris Thompson

About Mike Calabrese


Mike began his career in 2009, when he joined RTTS as a Test Engineer. He now has over a decade of experience successfully implementing automated functional, data validation, and ETL testing solutions for multiple clients across many industry verticals. Mike is a technical expert on QuerySurge, RTTS' flagship data testing solution, and he supports clients around the world with their QuerySurge implementations. Mike graduated from Hofstra University with a Bachelor of Science in Computer Engineering.

Connect with Mike Calabrese

Rate and Review TestGuild

Thanks again for listening to the show. If it has helped you in any way, shape, or form, please share it using the social media buttons you see on the page. Additionally, reviews for the podcast on iTunes are extremely helpful and greatly appreciated! They do matter in the rankings of the show and I read each and every one of them.

[00:00:00] In a land of testers, far and wide they journeyed. Seeking answers, seeking skills, seeking a better way. Through the hills they wandered, through treacherous terrain. But then they heard a tale, a podcast they had to obey. Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold. Oh, the Test Guild Automation Testing podcast. Guiding testers with automation awesomeness. From ancient realms to modern days, they lead the way. Oh, the Test Guild Automation Testing podcast. Oh, the Test Guild Automation Testing podcast. With lutes and lyres, the bards began their song. A tune of knowledge, a melody of code. Through the air it spread, like wildfire through the land. Guiding testers, showing them the secrets to behold.

[00:00:35] Joe Colantonio Hey, today we're going to be diving deep into the complexities of ETL testing with our guests, Chris Thompson and Mike Calabrese from RTTS. With over two decades of combined experience in data QA and automated testing, Chris and Mike are going to guide us through the intricacies of an actual data warehouse project plan they shared with me at an RTTS event we did last year. They uncover the challenges of data mapping, explore different transformation types, and discuss the critical role of a comprehensive mapping document. Chris and Mike also highlight the importance of rigorous data testing to prevent costly errors and ensure accurate decision making. So join us as we explore the critical aspects of ETL testing and data project planning, and gain valuable insights from two industry veterans. This is a must-listen episode for anyone involved in data testing and looking to optimize their data transformation process. And since this presentation, RTTS has also introduced AI into their ETL process, and they'll share how you can do the same in our upcoming Test Guild exclusive webinar this week, Leveraging AI to Simplify and Speed Up ETL Testing. Register now even if you can't make it live, because I'll be providing the recordings after the event. See AI and ETL in action by going to testguild.me/aietl. Hope to see you there.

[00:01:59] Chris Thompson Hi. Welcome, everybody. Welcome to the Test Guild event for QuerySurge. Today in my segment, I'm going to be talking about creating a project plan for a data warehouse project. A little bit about myself: my name is Chris Thompson. I'm a senior domain expert in the data testing practice and a military veteran. I spent five years in the United States military flying inside these things called P-3 Orion sub hunters. I have a bachelor's degree in computer science from the University of Delaware. I've done successful implementations of QA projects in the data space for well over 15 years. I've been with RTTS for 21 years, and I started with RTTS as an entry-level test engineer coming directly out of college. I've worked in numerous fields, including pharmaceutical, utilities, and retail. And that's me. And here we have Mike. Mike?

[00:02:51] Mike Calabrese Yeah. Thanks, Chris. I joined RTTS as a test engineer back in 2009. During that time, I've gained over a decade of experience successfully implementing both automated functional testing solutions as well as data validation and ETL testing solutions. I'm a technical expert on QuerySurge, our flagship data testing solution, and I'm currently helping clients around the world with their QuerySurge implementations. And I have a Bachelor of Science from Hofstra University in computer engineering.

[00:03:20] Chris Thompson All right. Thanks, Mike. So I'm going to give you a little bit of an introduction to ETL testing, and specifically to creating a project plan for data warehouse testing. Data testing is an integral part of the development of any data project, including data warehouses, data migrations, and integration projects. Bad data from defects can cause companies to make decisions that could cost millions of dollars, or in health-related fields, could cost somebody their life. One of my first clients was a big pharmaceutical company. I had no idea what data testing was. I was sent there as a functional automation engineer with some background in Rational Robot, for those that remember that tool, and also Mercury WinRunner. Well, they had Mercury WinRunner, and I was trying to test a data warehouse with Mercury WinRunner, and yeah, it didn't go too well. I ended up having to use a lot of SQL, and I actually did use WinRunner to execute that SQL and bring the results back into a very antiquated database called Microsoft Access. That's where I would do the comparisons, using minus queries, which we'll talk about here shortly. So we started off using WinRunner and minus queries, and it was very inefficient; it took a long time to execute, and we could not complete our tasks doing it that way. So eventually we built a utility in VB, which eventually became QuerySurge. But we'll get into that a bit later. A little bit more about the reasons why we want to test data: Walmart handles more than 1 million customer transactions every hour. That data is imported into databases that contain more than 2.5 PB of data, equivalent to 160 times the information contained in all the books in the U.S. Library of Congress. Additionally, Facebook handles 40 million photos, Twitter 85 million tweets per day, Google one terabyte per hour, and eBay 80 TB per day.

[00:05:28] Mike Calabrese There's a lot of data out there.

[00:05:30] Chris Thompson What is a data source? It's a source pool of data available for extraction. The concept of a data source is technologically neutral; it's not associated with any specific technology. The most common data sources are databases, files, and XML documents, but it could be just about anything that we can get our hands on. In my example of the pharmaceutical company, we were doing adverse events, which is, say you took Tylenol and got a nosebleed or something, and you went to the doctor. The doctor would enter that information into an adverse event system, which would be the source system that would eventually end up in the pharmaceutical company's data warehouse. All that data would be collected, and there could be many, many different systems from different states and different hospitals that would eventually need to get translated and loaded into a master data warehouse for that particular company. What is a data warehouse? It's a collection of data and information intended to support business decision making. Data warehouses contain a wide variety of data that presents a coherent picture of business conditions. A data warehouse is a huge repository of electronically organized data, mainly meant for the purpose of reporting and analysis. Most data warehouses are sent data from multiple sources. In my example, it came from a hospital or from a doctor affiliate somewhere, many different places. It could come in as files, it could come from an Oracle database, it could come from proprietary software or other types of sources. It's a place where historical data is stored for archival, analysis, and security purposes. All right, what is ETL? ETL stands for extract, transform, and load. It refers to the data handling process that involves extracting data from outside sources, transforming the data for operational reporting needs, and then loading that data into an endpoint, usually a database. For today's example, that endpoint will be a data warehouse. Why ETL? Well, businesses need to load the data warehouse regularly and incrementally, daily or weekly, so it can serve the process of supporting business analysis. Basically, we take a whole bunch of sources from different places and put them in the same type of format so that we can run reporting across all those sources. It's transforming the data into one set standard. Here's an example of that: on the left side we have a whole bunch of source systems; they go through an ETL process and eventually get to a target data warehouse. In another example, we have multiple systems that could be different: an inventory system, a customer service system, and an advertising system, all sort of asking different questions. A complaint like "the paint came off my widget" goes to customer service and would end up in the data warehouse, which then could feed data marts and be pulled out by BI tools to say, we have hundreds and hundreds of widgets where the paint is falling off. All right. The concept of test points, or ETL legs: an ETL leg refers to a single ETL process that moves or transforms data between two discrete points. From source to target is one test point to select; a full ETL process may have multiple legs. In my previous slide you saw that there was a data warehouse and a data mart, so each one of those is a separate leg, and we'll show that in the next slide. Test points are usually across single ETL legs: the verifications between the source and the target for that leg.
And I'll talk about that in the next slide as well, in more detail. An example: an operational source database is the source test point, and the data is extracted and eventually loaded into the data warehouse; testing is conducted across this leg. Here's an example of the different types of test points. In the example at the top, it's a multi-leg test: we take our data source over here and test all the way across to our data mart, testing all of those transformations in one go. In the bottom example, it's broken into its separate entities, with a test across each single leg. Both have advantages and both have disadvantages, but the single-leg approach has, in my opinion, more advantages. With single legs, more tests need to be created because there are more legs; the same tables and the same data need to be tested across each individual ETL leg. But the tests are less complex, because you could have transformations occurring across multiple legs, and if you were going to test the whole thing as a whole, you would be re-creating that transformation logic in your SQL to be able to test from the source all the way to its final destination. Defects are also easier to pinpoint. You can imagine, in the multi-leg approach, if there are 6 or 7 legs and you're testing them all in one go and you find a defect or an error or a bug, where in that 6- or 7-leg process did that defect occur? It's more difficult: you have to dive deeper into the data, and you may end up having to create tests in the single-leg approach anyway, just to find that defect. Additionally, for the single-leg approach, execution time tends to be longer. This is a bit of a disadvantage, but it's not always the case: we have more tests to execute, and even though they're automated, they still take computational time and hardware to run. And then for the multi-leg approach, quickly: there are fewer tests that need to be created, obviously; tests are more complex, like I discussed; defects are more difficult to pinpoint, like I discussed; and execution time tends to be shorter because there are fewer tests.
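
To make the idea of a single-leg test point concrete, here is a minimal SQL sketch of one verification across one leg. The table and column names (src_orders, dwh_orders, order_date) are hypothetical stand-ins for illustration, not names from the tutorial data.

```sql
-- Single-leg test point: the leg from an operational source into the warehouse.
-- Source-side query (run against the operational database):
SELECT COUNT(*) AS row_count
FROM   src_orders
WHERE  order_date >= '2024-08-01';

-- Target-side query (run against the data warehouse):
SELECT COUNT(*) AS row_count
FROM   dwh_orders
WHERE  order_date >= '2024-08-01';

-- The two counts should match; a mismatch localizes the defect to this one leg,
-- which is the main advantage of the single-leg approach.
```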

[00:11:16] Mike Calabrese As usual, Chris, what I would ask is: could there be sort of a mixed approach here, where you'd do both single- and multiple-leg tests?

[00:11:24] Chris Thompson You could definitely have a mixed approach.

[00:11:27] Mike Calabrese That's what I usually find, too, on the projects that I've worked on.

[00:11:30] Chris Thompson For some of the direct mappings, where a particular column doesn't have any transformations at all across all the legs to its final destination, it might be worthwhile to just create one test for those 1-to-1 mappings across the entire thing. That way, you don't have to create those simple tests at each leg of the testing process. Now I'm going to talk about the data mapping document. The data mapping document, frequently called the source-to-target map, is generally created in a spreadsheet. This document acts as a central part of the functional requirements. The following information is contained within the document: normally a mapping document would contain your source database information, such as your source table and source column; the same thing for your target, your target table and target column; any transformation logic; and any optional types of requirements.

[00:12:19] Mike Calabrese Definitely one of the most important documents when you're doing a data testing project.

[00:12:24] Chris Thompson Yes, this is it. This is what you need to get to be able to really test an ETL process correctly. I've been on client sites where they don't even have a mapping document, and I've had to reverse engineer what they've done to determine what they actually did, come up with a mapping document on my own, and then get business approval that that is the way it's supposed to be, because the development team had done it from verbal descriptions of what was supposed to happen. Have you seen that before, Mike?

[00:12:56] Mike Calabrese I've definitely seen that, Chris.

[00:12:57] Chris Thompson Anyway, here's an example of a segment of a mapping document. We're going to get into this in a little more detail, along with some of the transformation types that you see. This is not all-inclusive, but: direct maps, which are just 1-to-1 mappings where the column from the source to the target goes through no transformation whatsoever; it's just a direct pull. Selected columns and rows: this is where we might do some filtering to bring only specific columns or specific rows over into the target. Translation: this is where we translate one thing into another. Say on the source side we have state names like New Jersey, New York, etc., and in the target we have NJ and NY; that would be a translation, translating the name into its abbreviation in this case. Lookup is where we take a particular item, similar to translation, but we look it up in another table to figure out what the value should be. Transposing is where we take columns and turn them into rows, or rows into columns. We don't see it very often, but I have seen it in the past, so we put it on the list. Field splitting: an example of field splitting is, on the source side we have a full name column where "Chris Thompson" is in that field, and we're going to split that into two columns, one for first name and one for last name: first name Chris, last name Thompson. So one column into two. Field merging is exactly the opposite: on the source side we have Chris Thompson, but it's in two columns, one for the first name, one for the last name, and we put them together on the target side. Calculated and derived fields: these can range from very, very complex to relatively simple. Take, for example, a sales database, and I want to aggregate a particular value into the data warehouse, say the sales dollar amount for the month of August, or for a particular month. We would calculate that and load it into the target, and that would have to be tested to make sure those calculations are correct.
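
To make one of these concrete, here is a minimal SQL sketch of how a translation like the state-name example might be verified as a source/target query pair. The table and column names (src_customer, dw_customer_dim, state_name, state_code) are hypothetical and used only for illustration.

```sql
-- Source-side query: apply the documented translation logic so the output
-- is in the same shape as the target (full state name -> abbreviation).
SELECT customer_id,
       CASE state_name
            WHEN 'New Jersey' THEN 'NJ'
            WHEN 'New York'   THEN 'NY'
            ELSE state_name
       END AS state_code
FROM   src_customer;

-- Target-side query: pull the already-translated column from the warehouse.
SELECT customer_id,
       state_code
FROM   dw_customer_dim;

-- Comparing the two result sets row by row verifies the translation was applied correctly.
```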

[00:15:11] Mike Calabrese Now, some of these seem more complex than others, and I have a feeling that's going to become important.

[00:15:17] Chris Thompson Absolutely. How did you guess, Mike? So, some testing methods. I'm not going to go into them all in any detail. First, there's what we like to call stare and compare: we run some SQL, or we export data into Excel, and we look at it side by side to make sure the fields were migrated correctly. Obviously, this is very prone to human error, and how much data can you really test doing it that way? The second method I'll talk about is minus queries. This is where you take the source minus the target, and you end up with a result that is the difference between the two. However, if there are extra rows in the target and you do source minus target, you will not see those rows. To do it correctly, you've got to do source minus target and then union that with target minus source to get the full difference. This is highly time consuming and taxing on hardware.
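
A minimal sketch of the full pattern Chris describes, written with EXCEPT (which plays the role of MINUS on databases such as SQL Server and PostgreSQL; Oracle spells it MINUS). The table and column names are placeholders, and this assumes both tables are reachable from a single connection.

```sql
-- Rows in the source that are missing (or different) in the target:
(SELECT col1, col2, col3 FROM source_table
 EXCEPT
 SELECT col1, col2, col3 FROM target_table)

UNION ALL

-- Rows in the target that are missing (or different) in the source:
(SELECT col1, col2, col3 FROM target_table
 EXCEPT
 SELECT col1, col2, col3 FROM source_table);

-- An empty result means the two sides match; any rows returned are the differences.
-- Running both directions is the "double the work" Mike mentions.
```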

[00:16:19] Mike Calabrese It's like double the work.

[00:16:22] Chris Thompson Correct, it is double the work, Mike. So that's where QuerySurge comes into play: automation with QuerySurge. This is bulk data verification, testing sample sizes up to 100%; we have lots of clients that do that. We have management of our test assets within QuerySurge: it stores all your test code, which we call query pairs, and you can also group them together into what we call test suites. There's test scheduling: say every day at midnight an ETL batch job occurs and new data ends up in my data warehouse, and I want to test just that data, but execute the tests after the ETL has completed. I can schedule QuerySurge to run at 3:00 in the morning and run this group of tests, so that when I come in in the morning at 7:00, I have results waiting for me. There's persistent access to test data: QuerySurge stores the result sets in its own database, so you can go back in time to see what occurred in previous executions of your tests. And there's robust reporting to see how many tests passed and how many failed over time. An automated data testing approach with QuerySurge can significantly improve coverage and efficiency for an organization compared with the previously mentioned manual testing techniques. All right, enough with QuerySurge, let's move on. What we came here to do is show you a simple process for creating a project plan for a simple project. Some of the things that we're going to need: we're going to need to gather the project documents, so that mapping document, of course, the most important one, plus any other supporting requirements documents and data model documents. We're going to have to estimate the time it will take us to review all that documentation, determine the number of test engineer resources, determine the number of ETL or test legs, and determine the number of cycles or releases. Additionally, and this is probably the hardest part of the whole process, we determine the complexity of each mapping. We have low complexity. There's no standard here; this is just what we think is the best approach. Low complexity has no transformation logic: they're all 1-to-1 mappings, or they have very minor transformation logic, including a change to data types from source to target, such as a VarChar being converted to an integer, selected row filtering, and minor translations. We also have medium complexity transformation logic, including translations, joins across tables, field splitting, and field merging. And then we have our high complexity transformation logic, including major translations, multiple joins, calculated and aggregated fields, transposing, derived fields, and match and merge. We also need QuerySurge, of course, installed and configured for the project, and to know whether the lead and test engineers require training. I'm going to now jump over to our mapping document for today, which comes from our QuerySurge tutorial. Anybody who downloads the QuerySurge trial will get the documentation for the tutorial as well as this mapping document. There's a bit of a story to this tutorial: sample companies coming over to a data warehouse. We have two companies, Zcity and Xsmart, and the data is being ETL'd into a MySQL target data warehouse. For today, we're just going to do the project plan for this leg here: the Zcity part being ETL'd over into the target data warehouse. Here is the data model for Zcity. Very complex. What do you think, Mike?
Three whole tables?

[00:20:17] Mike Calabrese I mean, it seems like a lot, but you know, given the time frame you have, maybe it's all we could do right now.

[00:20:22] Chris Thompson I agree. We have three tables: purchaser, sales, and sales item. And then we have this MySQL data warehouse, which has-

[00:20:32] Mike Calabrese Now we're talking.

[00:20:33] Chris Thompson Now we're talking. Dims and facts, a little bit more advanced of a schema. And here are our lovely Zcity mappings for our data warehouse. We have a mapping number, a source table, a source column, the source type, the transformation logic, the target table, target column, and target type. We also have some comments where additional logic needs to be applied, and an example of what the data looks like. We're just going to go through each one of these and talk about the transformation type that's occurring, as well as whether we think it's low complexity, medium complexity, or high complexity. Let's look at the first one, Mike. We're looking at an auto sequence going into the target table customer dim. It's an ID, customer ID. It's also a primary key, Mike, and it's an integer. What kind of transformation type do we think that could be?

[00:21:29] Mike Calabrese I'm kind of leaning towards low complexity here, since we're just incrementing things.

[00:21:35] Chris Thompson And it's just an auto sequence, so we're just going up 1, 2, 3, 4, 5, 6, 7 on the target side.

[00:21:41] Mike Calabrese I'm going to say between low and medium. I can't decide. I don't know.

[00:21:44] Chris Thompson Let's see what I guessed. I said translation, medium. Mike, what do you think?

[00:21:48] Mike Calabrese You know what, now that I'm thinking about it, that kind of makes sense. We'll have to apply some tests to make sure the auto sequencing is being done properly. And we're seeing this is a primary key, too, so we're going to want to make sure there are no duplicates in the test. We're going to have to get a little more complex than just a straight count and direct map. I agree.
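
A minimal sketch of the kind of duplicate check Mike is describing for the surrogate key. The customer_dim and customer_id names follow the mapping under discussion, but the exact identifiers in the tutorial schema may differ; a passing test returns zero rows.

```sql
-- Verify the auto-sequenced primary key in the target has no duplicates.
SELECT customer_id,
       COUNT(*) AS occurrences
FROM   customer_dim
GROUP  BY customer_id
HAVING COUNT(*) > 1;
```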

[00:22:08] Chris Thompson All right, good, because I was going to override you anyway. All right. Now we've got 101. This is a direct map, so it's pretty simple: it's a VarChar to a VarChar. So, Mike?

[00:22:21] Mike Calabrese I'm going to guess low. It's a direct map.

[00:22:24] Chris Thompson You sure it's not a high one?

[00:22:24] Mike Calabrese No it's definitely not.

[00:22:25] Chris Thompson Oh, you're right. It's a direct map. All right. So now we can go into something a little bit more interesting. Here we have the example I've been talking about during the slide presentation: the full name field in the source column, and in the target data warehouse it's two columns, first and last. So we extract the data into the customer dim first name from the source purchaser.full name by splitting the full name on the space and taking the first substring. All right, what do you think, Mike?

[00:22:54] Mike Calabrese Well, so we have a full name column; that means the whole name is on the source side, and we're going to split out the first name here. We definitely have some transformation logic. I'm going to say that's a medium. I mean, there is some logic here, but maybe it's not that complex; it's something that we can code around.

[00:23:10] Chris Thompson Right, so that's field splitting, and medium. And similarly for the last name field, it's going to be the exact same thing: field splitting, and medium. All right, what about the purchase date field being a direct map, except there is a join, Mike. There's a join between the customer dim table and the date dim table, joining on the date to pull the date dim ID.

[00:23:36] Mike Calabrese I mean, when I see direct map, I almost always say low complexity, but you're adding some complexity here with this join. So I'm going to categorize this, I'm going to guess medium here. We've got a medium.
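
To show what a query pair for the first-name split might look like, here is a minimal sketch. It assumes both sides accept MySQL-flavored SQL (the tutorial target is MySQL) and uses approximations of the tutorial names, purchaser.full_name and customer_dim.first_name; the real identifiers may differ.

```sql
-- Source-side query: reproduce the documented transformation
-- (split full_name on the first space and keep the first part).
SELECT id,
       SUBSTRING_INDEX(full_name, ' ', 1) AS first_name
FROM   purchaser;

-- Target-side query: read the already-split column from the warehouse.
SELECT customer_id,
       first_name
FROM   customer_dim;

-- The automated comparison of the two result sets is what QuerySurge calls
-- a query pair; any mismatch flags the field-splitting logic for this mapping.
```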

[00:23:47] Chris Thompson All right. I'll continue on going through this process with Mike, and eventually we come to an agreement on everything. For these next few, I'll pull up something a little bit more complex, which is this one down here, actually: the purchaser billing and shipping addresses. On the source system, we have columns for both the shipping address and the billing address. However, you will notice that in the data warehouse we have a single table that contains all the addresses, and then on the customer dim table we have the shipping address ID and the billing address ID. So this has to be ETL'd over, and it's a little more complicated of a thing to do. There's some logic here, and I won't read it, I don't want to bore anyone, but if anybody wants to read it, take a look at our tutorial data. Obviously, this is a bit more complicated than your standard one, as we are doing one thing if the billing and shipping address are the same and something different if they're different. What do you think, Mike?

[00:25:00] Mike Calabrese Well, since you haven't done it yet, I'm going to guess this is high complexity, but also because of this transformation logic that we see here: it looks quite complex, and developing a test around it, I think, will be pretty involved. I'm going to guess high here. Let's agree that it's a high, if we were going to decide it ourselves.

[00:25:19] Chris Thompson I would have said it for you anyway. That's okay. All right, it's high. And to save some time, we've gone through this process together and come up with our results across the board for this entire table, as well as these other tables. I have three tables from the source side coming over into the data warehouse, and we've figured out the complexity for each one. Now I can move on to the next part: the project plan creation. Back to the creation of the project plan, we now have some information on the complexity of our tests, and I've additionally pre-filled out some information. I estimated four hours to review the documentation. What do we think, Mike? I think that's good.

[00:26:15] Mike Calabrese Yeah, I think so.

[00:26:16] Chris Thompson I'm going to be the lead, and I've selected Mike to be the test engineer. All right, so now back to creating the project plan. We now have some of the numbers that we need. For reviewing the documentation, I'm going to put four hours. The number of test engineers: Mike's going to be the test engineer for today, so I have one, and I will be the lead. The number of ETL legs is just one; there are no intermediary legs. The number of releases and cycles we've estimated at four. We have seven low complexity tests, 21 medium complexity tests, and 8 high complexity tests. All right, so we now have the information we need to create a project plan, and we're not going to create it from scratch. QuerySurge has been building a tool called the QuerySurge project plan estimator tool; it's built in Alchemer, also known as SurveyGizmo. It's in beta, and this is the first time it's been publicly displayed, but it will automatically create our project plan for us based on the information we've already gathered and can provide to it. I'll put my name in pretty quickly here, I'll use my RTTS email, and Mike, I'm going to drive.

[00:27:31] Mike Calabrese You've got it, Chris. It's your show.

[00:27:34] Chris Thompson I click next. Next are the project estimator inputs. That gives you some basic information on what you need to enter. We've already discussed most of it, so I won't go into it in detail. We are doing the Zcity data warehouse project, DWH. Estimated time to review the project documentation and assets: we set 4. What is the total number of test engineers? 1. How many test legs? We said 1. Estimated total number of releases? We said 4. What's the total number of low complexity tests? I believe we said 7. And medium complexity, 21. How many highs do we have, Mike?

[00:28:12] Mike Calabrese It was 8.

[00:28:13] Chris Thompson I think you're right, I think it was 8. Does QuerySurge need to be installed and configured? For me, it's already installed and already configured. Does the test lead require training? Now, I am an expert, but Mike is not an expert, so he needs training. You can also set a start date, but I'll leave it blank for today. Then I'll click next, and these calculations automatically occur. These numbers and calculations are based off of QuerySurge and RTTS best practices and what we find occurs at client sites and for customers. We go through different phases. Our first phase is test strategy, and here are all the tasks associated with doing a test strategy: analyze the documentation, define ETL legs, define environments, define release schedules, define the high-level project schedule, define the defect workflow, create the test strategy document, and QuerySurge training. And then it breaks that out into how many hours for the lead and how many hours for the engineer.

[00:29:13] Mike Calabrese Looks like you get a lot of the work here, Chris.

[00:29:15] Chris Thompson I do have a lot of work in that phase. Phase two is test planning: identify the data validation tests, which we've sort of already done to some extent when we talked about the mapping document, then define and assign resources. In this case, it'll be easy because I'm going to assign everything to you, Mike.

[00:29:36] Mike Calabrese Well so far you've got most of the work. So I'm feeling pretty good.

[00:29:43] Chris Thompson Define the detailed schedule, install QuerySurge, and create the test plan. Yeah, you're right, you do have a lot of hours going on there. It's the next phase, though, Mike, where the work changes: test design. Create the low complexity tests, create the medium complexity tests, create the high complexity tests, verify the queries, verify the query pairs and test suites, and review and oversight. I only have 8 hours, and so on, Mike.

[00:30:06] Mike Calabrese I can get a lot of results.

[00:30:08] Chris Thompson I don't know if that's accurate; I may have to change that calculation based on who's doing the work.

[00:30:14] Mike Calabrese Well, in this case, you can probably take that number down.

[00:30:17] Chris Thompson And then execution and analysis, phase 4, is similar: execute the query pairs and test suites for the low, medium, and high complexity tests, verify the queries for the query pairs and test suites, and review and oversight. And when I click this next button, it creates a high-level project plan for QuerySurge, with some basic terminology and an overview of the project assumptions, all the tasks we just talked about, along with all the hours associated with each phase. Then it adds up all the different phases for the lead and test engineer hours. And it looks like you've got more work than me.

[00:30:57] Mike Calabrese Yes. So it worked out that way.

[00:30:59] Chris Thompson Resourcing days are rounded up, so we have 16 days for me and 21 days for the engineer. In total, the engineer is going to be the roadblock here, so it's going to take 21 days, or 4.2 weeks. And that's about it. This tool will actually be available to the public and to our customers from QuerySurge.com; expect some more information in the coming months.

[00:31:28] Mike Calabrese Having gone through putting these together, I think estimating the timelines is one of the hardest things to do, especially at the outset of a project. Hopefully, you learned some things today from Chris and his insight here into how to break down the mappings and the translations and transformations that you'll be seeing. And this tool will definitely help with that.

[00:31:49] Joe Colantonio Thank you, Chris and Mike, for your ETL awesomeness. And once again, you don't want to miss this upcoming webinar on how to use AI with ETL to make testing even better. All you need to do is go to testguild.me/aietl. And for links to everything of value we covered in this episode, head on over to testguild.com/a505. All right, so that's it for this episode of the Test Guild Automation podcast. I'm Joe, and my mission, as always, is to help you succeed in creating full-stack automation awesomeness. As always, test everything and keep the good. Cheers!

[00:32:23] Thanks for listening to the Test Guild Automation podcast. Head on over to testguild.com for full show notes, amazing blog articles, and online testing conferences. Don't forget to subscribe to the Guild to continue your testing journey.

{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
A person is speaking into a microphone on the "TestGuild News Show" with topics including weekly DevOps, automation, performance, and security testing. "Breaking News" is highlighted at the bottom.

Testing Castlevania, Playwright to Selenium Migration and More TGNS136

Posted on 09/23/2024

About This Episode: What game can teach testers to find edge cases and ...

Boris Arapovic TestGuild Automation Feature

Why Security Testing is an important skill for a QEs with Boris Arapovic

Posted on 09/22/2024

About This Episode: In this episode, we discuss what QE should know about ...

Paul Kanaris TestGuild DevOps Toolchain

WebLoad vs LoadRunner vs jMeter (Who Wins?) with Paul Kanaris

Posted on 09/18/2024

About this DevOps Toolchain Episode: Today, we're thrilled to have performance testing expert ...