Tuesday 28 February 2012

Google's Web History policy

Google is changing its web history policy from March 1 2012. Which means you will not be able to delete the history of your sessions in all google products. Personally I kind of like history. I can go back if I suddenly remember I was doing something important last month on the 25th but cannot recall the specifics. I can then go to the web history page and find out exactly what i was searching or what web pages I was visiting on that day.

I do not think this will violate anybody's privacy in anyway. Privacy will be compromised if someone else gets to know your password. Well passwords compromise is not just Google's problem - it is every other provider's problem.So Its personal , Keep history or get rid of it.

Saturday 21 January 2012

How much does SQL Server cost?

SQL Server 2008 R2  standard edition will cost about AUD 30,000 .  This includes the cost of the server machine and the SQL server license. The cost I am talking about is when a corporations buy the server pc as well as the license in bulk. It may cost more if not negotiated. Well, it might  cost less as well with more negotiation power.

Monday 9 January 2012

Cannot start Microsoft Outlook. Cannot open the Outlook window. Invalid XML, the view cannot be loaded.

what is this error? I get this which I try to open Microsoft Outlook 2007 on a XP machine.

Cannot start Microsoft Outlook. Cannot open the Outlook window. Invalid XML, the view cannot be loaded.

Sunday 8 January 2012

What is a Data Warehouse ?

Data Warehousing is a  relational database storage of data for analysis purpose. Its usually separate from transaction processing system. There is usually an OLAP system that sits on top of this. Data warehouse is separated from the online transactional database because online transaction processing database should not be impacted in terms of performance by running analytical queries.Moreover, historical data are periodically removed from live transactional database to maintain its performance. These historical data are then migrated into a data warehouse.  Datawarehouse is used to study the past behaviour of the data and develop strategies based on the findings. For example: Sales of product is found to be higher during christmas period by analysing last  3 years of data. Businesses can now make a decision to increase the production  accordingly to meet the demand made by  increase in sales.
In many companies, transactional data arrives to them from other providers on a flat/delimited file. There is usually an ETL process to load these flat files into the Datawarehouse. Datastage, Informatica are the modern popular ETL tools. There are also less popular tools such as Sagent in the market that is being used by a few companies. SQL Server DTS and JCL are also still being used by a lot of companies for ETL processing.
There are many open source ETL tools out there. I have myself used a couple of them. They do the job very well. They are as user friendly as expensive tools such as Datastage and Informatica. So if you are thinking of installing new ETL tool in your company and do not want to go with buying expensive tools, I recommend using Open Source tools.
Cognos and Hyperion are among most common OLAP tool used in the market. They can process huge amount of data. All they basically do is help you to slice and dice data to extract some information. You can think of them as Pivot Table function in Microsoft Excel.  Microsoft Excel 2007 has a limitation on the number of records it can handle. It can handle only 1,000,000 rows. Cognos can handle much more records than this. But I think newer version of Excel can handle more records. I have not used them myself so I cannot confirm that.  There are also other cheaper and free open source OLAP tools out there. They do the job as well as the expensive tools. I am a big fan of open source tools.

Please let me know if you have a question and I will be happy to answer it for you.

Friday 6 January 2012

SQL Server : Can I enable or disable index when the table is being loaded??

You can absolutely alter index when the corresponding table is being loaded.

Many times I trigger ETL processes without considering the indexes.  Then after sometime I realise that the load which was meant to finish very quickly  did not finish for a considerable amount of time.  This usually happens for new development or test loads.

For production load, I disable indexes if I have to load huge data set into the tables. It is not always possible to disable an index on a production table if it is being used by multiple users when the loading is taking place. Disabling indexes will slow the performance for users who are querying the table.

To read how to enable/disable index, please read sql-server-enable-and-disable-index.html

Example:

CUSTOMER table has an index named IDX_USERID on USERID field.

When web user JOHN logs in , IDX_USERID is used by SQL server to quickly extract details of user JOHN. CUSTOMER table has tens of millions of customer records. Without the index it would take forever for SQL query to extract details for the required customers. Therefore index is very important for user experience as people do not like systems that have slow response.

If this system is a 24x7 available system for users, then its not possible to disable indexes on this table during loading. But usually most of the systems have a down time of atleast a few hours everyday. This downtime is used by support staff to quickly load the tables.

Hope this helped

SQL Server : Enable and Disable Index

I spent half of my day today to enable indexes that i disabled yesterday during data load. I disabled most of the indexes in tables that I was loading to. Indexes slow down data loading as they have some overheads. After loading each record, SQL server goes on to update index definitions. This will impact the loading time.

I therefore usually disable the indexes before load.

Today I tried to re enable indexes using SQL Server Management Studio 2008R2.  To enable indexes I right clicked on the index name in the explorer and clicked 'Rebuild' . This took ages to run and bring up the next window which needs to be clicked to rebuild the index. I then went into the query editor window and enabled index manually using the below SQL query

ALTER INDEX index_name ON table_name REBUILD

It was a lot faster to rebuild it this way.

So remember: if the table is very small, go for index rebuilding using SQL Server Management Studio. Otherwise use the Query Editor and rebuild index manually.

To Disable Index, Management Studio is quite fast. just right click on the  index and select 'Disable'. this works like a treat.

or alternately , to manually disable index

ALTER INDEX index_name ON table_name DISABLE