- Why and how are column store databases more optimized for aggregations?
Ans: Since all the values of a particular column are stored in contiguous memory locations, the read is faster.
- How does HANA compress the data? Types of compression methods?
Ans: There are three ways HANA compresses data. They are Dictionary Encoding, Run-Length Encoding and Cluster Encoding. But, mostly Dictionary encoding is used by HANA as it is the most optimized.
- When you need to read all the columns of the databases table, is Hana better than the traditional database?
Ans: Traditional database is better in this case. As the time needed to reconstruct the entire row for all the data in the table is more.
- Can we use row based tables in modeling? If not how can we use?
Ans: We can’t use the row based tables directly in the Graphical or script views. But, the table can be used in a procedure and then the procedure can be used in the script based calculation view.
- What will be the default schema name when the data is moved to HANA using SLT method?
Ans: The name of the replication configuration in the SLT.
- What is index server? What does it do?
Ans: The main SAP HANA database management component is known as the index server, which contains the actual data stores and the engines for processing the data. The index server processes incoming SQL or MDX statements in the context of authenticated sessions and transactions.
- What is a key attribute in attribute view? Why is it needed?
Ans: Key attribute is a kind of primary key for the attribute view. But, it doesn’t prevent the passage of duplicate values. It is basically needed to determine the join execution order at the run time when multiple tables are used in an attribute view.
- Will marking a column as key attribute prevent the flow of duplicate values in the output of the attribute view? If yes how? If no then why?
Ans: No, the function of primary key column in the attribute view is to determine the join execution order at run-time but not preventing the duplicates.
- Which is better to use to restrict data? Variables or Input Parameters?
Ans: Input-parameter. Variable acts as a condition in the where clause. Whereas the Input-parameter parameterizes the input data set itself. That is it filters the data even before the data comes to from clause.
- How is a variable different from an Input parameter in the way they function?
Ans: Variable is always associated with a column. But an input-parameter is independent of columns. Variable can only be used to restrict the output data set. But, the parameter can be used to apply some logic for any calculations and restrict data as well. Variable acts as a condition in the where clause. Whereas the Input-parameter parameterizes the input data set itself. That is it filters the data even before the data comes to from clause.
11 .Suppose there is a requirement to use Hierarchies in HANA in the Analytical view, how do we do the modeling for such requirement?
Ans: It is known that hierarchies can’t be created in the analytical views. So we can create the hierarchies in the attribute views and use the same in the analytical view.
- Are the hierarchies created in the Attribute view available/visible in the calculation view?
Ans: No
- Is it better to use a table directly or an attribute view on the table in a calculation view?
Ans: Depends. If all the columns of the table are needed in the output then we can use any of the options available. But, if only few columns are needed then it’s always better to create an attribute view and then use that.
- Which views are executed in which engine in HANA? What are each type of views optimized for?
Ans: Attribute View in Join Engine, Analytical View in OLAP Engine and Calculation View in CE/SQL Engine. But future roadmap by SAP is that they will be eliminating the attribute and analytical views and only CV’s are to remain. In that case the HANA engine automatically will determine the best possible option to execute the view.
- What are the performance improvement techniques used in the project?
Ans: a. using referential or left outer join wherever possible.
- Remove the attribute views from the analytics views, which are not required for the intermediate processing. And join them back at the final calculation view which is exposed to reporting. This brings down unnecessary columns flowing from the bottom layers.
- Use projections only when filters are needed to be applied. This brings down the unnecessary overhead on the cache memory
- Use unions in place of joins wherever possible.
- Use input parameters and database filters at the lowest levels to prevent the flow of unnecessary data.
- Try and avoid calculated columns with IF expressions.
- Avoid joins on calculated columns.
- If calculated columns have few unique values only, in that case instead of calculating a new column try staging the possible values in a custom table and use join on the table.
- Can we restrict data in the aggregation node?
Ans: Yes and NO. There are two aggregations in the view. One which we pull from the tool palette and the other is attached to the semantics node. We can restrict data in the tool palette node and not in the other one.
- How is union in the graphical view different from traditional SQL union?
Ans: It is not necessary to have equal number of columns coming from each of the input sources in the graphical interface.
- What is Union with constant in HANA? Explain one business scenario where you can use that?
Ans: In a union view, a Constant Column is created if there are any target or output attributes for which there are no mappings to the source attributes. The default value for the constant column is NULL.
For example, you have two tables with similar structures, Actual Sales and Planned Sales, corresponding to the sales of products. You want to see the combined data in a single view, but differentiate between the data from the two tables for comparison. To do so, you can create a union view between the two tables and have a constant column indicating constant values like A & P for example, instead of calculating the values in each of the inputs and then passing to the union.
- What is the difference between the standard and derived attribute view? And what is the use of Derived view?
Ans: Standard attribute view can be modified. But, the derived can’t be modified directly. It’s basically a replica of the standard view. The changes made to the standard view are replicated in int derived view as well. In cases where we need to give the read only access to a developer on any view, we can use this option.
- What is the equivalent function of Substring in the graphical interface?
Ans: Midstring. It basically works with syntax midstring (string, start position, end position).
- Suppose we need to restrict the data for the last day of the month and display the closing balance for example, how do we do that?
Ans: Join with the time dimension and create a restricted column of the measure and use the LAST_DAY_OF_MONTH column as the restriction.
- You have to calculate an attribute for example a quarter based on the month. You have the option of using the calculated column or create a z table and stage the data for month to column conversion and then join the table with the transaction table. Which one would you choose and why?
Ans: We can choose the z table option. As the quarter calculation expression might include the ‘if’ statement which limits the parallelization of the execution.
- Suppose an attribute view is built using two tables. and is used in the logical join in analytical view. In logical join can we join on the columns coming from both the tables in the attribute view?
Ans: No. We can use the columns from any one of the tables in the attribute view for joins.
- Why is performance of the script based view bad when compared to graphical view?
Ans: Because the execution of the script view leads to massive data transfer between the database and the engines which can consume good amount of time.
- Can we use input parameters to restrict data? If yes how?
Ans: Yes. We can use the apply filter option within a projection or aggregation and mention the IP as the filter value.
- What is CV with star join?
Ans: It is a view basically used to achieve the functionality of two Analytical views. Here we can have star schema with two fact tables. But, here the attribute views are replaced by the dimension calculation views.
- Hierarchies. What are two types of hierarchies that can be built using HANA studio? Which one have you worked on?
Ans: Level Hierarchy and Parent-child Hierarchy. Explain any one.
- Suppose you have a requirement where you need to combine data from HANA and an external oracle database. How do you do that in HANA?
Ans: Use the HANA Smart Data Access. Here we can configure an external database metadata in the HANA studio, provided the porta are opened up and the drivers are available and configured.
- Explain one design flow where you have calculated YTD (Year to Date) and MTD (Month to Date) key figures in HANA?
Ans: Use a separate projection or aggregation for the whole data set and put a filter using input parameter as the date range. And union this data with the main data set.
- How many data inputs can a join node have? And what is the logical explanation for that?
Ans: Only 2. Because even when multiple tables are used in a join statement in the SQL, at the backend always a join between two tables is executed first and then that result set is joined with the third table. So it logically makes sense to have 2 inputs to join node.
- What is text join and how does it work?
Ans: Text join is a type of join in the HANA. Here a language specific column SPRAS must be present in the key and text tables. In the join properties we can mention the language column as SPRAS. This while execution picks up the HANA session default language and brings the descriptions in that particular language.
- Mention what are the advantage of SLT replication?
Ans: •SAP SLT works on trigger based approach; such approach has no measurable performance impact in the source system
- It offers filtering capability and transformation
- It enables real-time data replication, replicating only related data into HANA from non-SAP and SAP source systems
- It is fully integrated with HANA studios
- Replication from several source systems to one HANA system is allowed, also from one source system to multiple HANA systems is allowed.
- What does Configuration in SLT mean?
Ans: The information required to create the connection between the source system, SLT system, and the SAP HANA system is specified within the SLT system as a Configuration. A new configuration in Configuration & Monitoring Dashboard (transaction LTR) can hence be defined.
- Why is HANA faster in reading the data than other traditional databases?
Ans: HANA is not necessarily faster than the traditional databases when reading data. The speed is mainly because in columnar storage databases only the columns needed can be read. But, in traditional databases even if one column data has to be read all the columns are read and only one column data is displayed. Apart from that HANA reads data from RAM and not disk. SO, this makes HANA even better in terms of data read.
- What all tools can access HANA Information models?
Ans: All the reporting tools can access HANA models, directly or in-directly. For example Design Studio can read the data from a HANA view directly. Whereas, a WEBI report can access HANA models only via a Universe.
- Does HANA support hierarchies?
Ans: Yes. In HANA studio while creating views we can model hierarchies in Attribute and Calculation views. But, not all the front end reporting tools are capable of reading the HANA hierarchies as of now. This is mainly because HANA hierarchies can only be read by the tools which use MDX connections. Excel and Design Studio are two such tools.
- Where can we see step by step execution of HANA view and optimize?
Ans: In Planviz perspective.
- Initial read from HANA database happens from disk or RAM?
Ans: Even the initial read happens from the RAM. As all the data at the start of HANA database gets loaded into the RAM.
- Is there a way to edit a HANA view apart from modeling it in the HANA studio?
Yes: A HANA view even can be created in an XML and then imported into the HANA package. It can even be edited after exporting and then edited view can be re-imported.
- When we have same number of record output coming out of an Aggregation and Projection which one is better to use in a calculation view? And why?
Ans: It is better to use Projection as Aggregation would do a Group by all the attribute columns in the node which is an extra overhead on the system.
- When we have a scenario where a user can input single value or multiple values or even enter nothing, which is the functionality in HANA that can be best leveraged, Input Parameter or Variable?
Ans: Variables. The reason is the Optional Input Parameters don’t work without a default value and setting the default value for IP will fetch only the results for that value.
- Imagine a scenario where we have data from one projection going into multiple projections or aggregations. Which is the better way to model among the two following options?
Option 1: Create 4 similar projections and feed the data into different aggregations.
Option 2: Create one projection and take the output to different aggregations?
Ans: Both are the right way to model. Option 1 gives the highest possible degree of parallelization. But is bad in terms of memory consumption. When dealing with high data volume reports we might run into memory allocation issues. But in option 2 though we reduce the degree of parallelization, we are less likely to run into any run time issues. So, option 1 is better for low data volume reports and option 2 is better for high data volume reports.
- Where are the activated views stored?
Ans: In _SYS_BIC schema.
- What are the different transport mechanisms in HANA?
Ans: HANA CTS+, Native HANA Transports and Manual Export-Import.
- What is needed to be installed to access HANA views in the reporting tools?
Ans: HANA Client.
- Which is the most optimized join in HANA?
Ans: Referential and Left-outer with optimize join option enabled.