TryCatchFinally.net: Some SQL, some .NET, and whatever else

3 Apr 2014

Oracle GoldenGate REPLICAT frozen on “Starting”

We use Oracle GoldenGate (expensive and probably overkill for Oracle->MSSQL replication, but good at what it does) to replicate data from an Oracle database into a SQL Server. However, I got an alert the other day that replication had stopped, and when I checked the status, all the REPs we had set up were sitting at "Starting...", but none of them was actually doing anything.

Attempting to stop them got the following error:

GGSCI (GGSERVER) 68> stop rep MYREP
Sending STOP request to REPLICAT MYREP ...
ERROR: opening port for REPLICAT MYREP (TCP/IP error: Connection refused).

 

Stopping/Starting the manager service or rebooting the PC didn't help either - they still said "Starting" and were unresponsive. Even stranger, deleting and recreating the REP gave the same result - before I even attempted to start the REP for the first time, it said "Starting", and an attempt to start it gave me "Process is starting up - try again later".

The cause was the REP process status file, located in the DIRPCS folder under the GoldenGate root - there should be a file for each REP that's currently running, giving details about its status, and when a REP stops, this file is deleted. Since all of the current REPs weren't doing anything (they were all sitting at the end of the previous trail file), they should have been stopped. I deleted the PCR files for the affected REP streams, and the manager then reported "STOPPED" - at that point, I was able to start up each REP without issue.
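For illustration, the cleanup looked roughly like this from a command prompt in the GoldenGate root (C:\GG and MYREP are stand-ins for your own install path and REP names; in my case, DIRPCS held one .pcr file per stuck REP):

C:\GG> dir dirpcs
  MYREP.pcr
C:\GG> del dirpcs\MYREP.pcr
C:\GG> ggsci

GGSCI (GGSERVER) 1> info rep MYREP
GGSCI (GGSERVER) 2> start rep MYREP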

I'm not sure how they got that way, but once started again, they all worked without issue. I hope this saves you the troubleshooting time of hunting down these files!

27 Sep 2013

Unexpected results in an SSRS report when using Oracle OLEDB driver and OPENQUERY

Here's a pretty specific situation we ran across recently - a comment in our OPENQUERY text was causing the results of our query to be completely different than expected.

The T-SQL query we'd written in SSMS that fetched data from an Oracle linked server was running in about 30 seconds, but when we put the query directly into an SSRS report, it would never finish. We were stumped, since we were executing the exact same query in both SSMS and SSRS, but SSRS appeared to hang when executing it. We trimmed down our original query (a few CTEs containing OPENQUERY, joined to a handful of local tables) until we found the single CTE that was doing it. Executing that query in SSMS (or executing the OPENQUERY portion directly in PL/SQL Developer) returned results in about two seconds - however, putting that same portion on a report (even using the report wizard and with no special formatting) resulted in a report that never finished running, whether in the report designer or deployed to our SSRS site.

As it turns out, the SSRS engine (this occurred both in BIDS/SSDT and once the report was deployed) was consolidating the OPENQUERY text to a single line when executing it, rather than respecting the line breaks in our query text. As a result, the "--" that preceded a comment midway through the query was actually commenting out the entire rest of the query, SQL injection-style. This meant the WHERE clause was ignored completely, giving different results - effectively, SSRS was taking this query:

SELECT *
  FROM OPENQUERY(OracleServer,
                 'SELECT *
                    FROM SomeSchema.SomeTable -- Note the comment here
                   WHERE ID = 50')

and executing it like this:

SELECT *
  FROM OPENQUERY(OracleServer,
                 'SELECT * FROM SomeSchema.SomeTable -- Note the comment here WHERE ID = 50')

Since our query wasn't returning results, but just hanging for 30 minutes before we gave up, we asked our Oracle admin to watch the execution plan, after seeing this Stack Overflow question (and the comments on the accepted answer) suggesting that others had seen SSRS generate a completely different execution plan for the same query. By comparing the plans generated when the query was executed from SSMS and from SSRS, I'd hoped the difference would be obvious - it was. This didn't make any sense at first, since the same query should be hitting SQL Server, and so exactly the same OPENQUERY pass-through query should be hitting Oracle, but the Oracle DBA confirmed that half of our Oracle query was commented out when it came from SSRS. This meant that our expected 1,000-row result set had become a 1.5 billion row dump of the entire table, which would explain why we were seeing the delay on our report!

This didn't happen in SSMS, and it didn't happen when previewing the dataset itself in SSRS, but only occurred when previewing the actual report (this is the only case where SSRS submits a modified version of the query to the SQL engine). Removing the comment from the query resolved the issue, as did switching the comment syntax to /* */ from double-dash.
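For comparison, here's the same query with the comment switched to /* */ syntax - since the block comment carries its own terminator, collapsing the text to a single line no longer comments out the rest of the query:

SELECT *
  FROM OPENQUERY(OracleServer,
                 'SELECT *
                    FROM SomeSchema.SomeTable /* Note the comment here */
                   WHERE ID = 50')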

This occurs in both SQL Server Data Tools (based on VS2010) and BIDS 2008 R2, which is what we had handy to test. Also, it appears to apply only to the Oracle OLEDB driver - an Oracle linked server created via ODBC, as well as a SQL OLEDB linked server, didn't exhibit the same issue, and the final report results were filtered as expected.

Steps to reproduce the issue:

1. Create a linked server using the Oracle OLEDB driver

2. In SSMS, create a query that accesses the linked server using OPENQUERY with a simple WHERE clause inside

For example, we had something like the following:

SELECT *
  FROM OPENQUERY(OracleServer,
                 'SELECT *
                    FROM SomeSchema.SomeTable
                   WHERE ID = 50')

3. Execute your query and make note of the number of rows it returns

4. Modify your query by adding a short comment at the end of your FROM line - this should not impact the query results at all

SELECT *
  FROM OPENQUERY(OracleServer,
                 'SELECT *
                    FROM SomeSchema.SomeTable -- Note the comment here
                   WHERE ID = 50')

5. Execute your query again - the row count should be unchanged from step 3

6. In BIDS or SSDT, create a new report, either manually or using the wizard, providing your modified query from step 4 as the dataset

7. Once your report is created, view the dataset properties and preview the query results - note that they match the expected row count

To do this, right-click your dataset in the "Report Data" panel and select "Dataset Properties". When the properties window opens, select "Query Designer..." near the bottom, then click the exclamation point in the toolbar to execute your query and preview the results. You'll see your filtered result set, as expected. When you're done, close this window and go back to your report.

8. Preview your report and notice that the WHERE clause is ignored - all rows from your table are displayed

Summary and take-away

I'm only using the placement of the comment and the ignored WHERE clause as an example - the comment could be anywhere in the query, and could even result in a query with invalid syntax that refuses to execute at all.

I hope this explanation helps you avoid the two days of troubleshooting we did to get to this point and find the cause! I'm unclear on why the driver behaves this way, or if it's an SSRS issue specifically (I suspect it is, since it doesn't occur in SSMS with the exact same query). If anybody can point me to an open Connect item, I'd be happy to vote for it, but until then, I'm making the effort to migrate to using /* */ comment syntax everywhere - not only is it more clear to readers and flexible for in-line comments, it doesn't break OPENQUERY functionality (and that's reason enough for me).

31 May 2013

Error creating emailed report subscriptions in SharePoint-integrated SSRS

Anybody who has the rights to run a report in SSRS can set up a subscription, delivering the report to their own email address. However, if you try to deliver the report as an email to somebody else, you might be greeted with this unclear error message:
Reporting Services - other user's email address
It reads:

A subscription delivery error has occurred. (rsDeliveryError)
   One of the extension parameters is not valid for the
   following reason: The account you are using does not
   have administrator privileges. A subscription cannot
   be created for bob.thompson@mydomain.com.
   (rsInvalidExtensionParameter)

I was setting up a subscription in our SharePoint-integrated SQL Reporting Services site, and was attempting to send the report to our shared group email address. I could set it up to deliver to me and then forward it, but it wasn't letting me send the email to anybody else's address, either inside or outside the organization.

Solution: Grant your user "Create Alerts" and "Manage Alerts" permission

I received this error because I lacked the "Create Alerts" and "Manage Alerts" permissions on the report in question (or its folder, or higher up). The error says you need to be an administrator, but doesn't really tell you what's wrong with your request - it's really complaining that you're delivering an alert to somebody else. Granting those rights to your user account (for that report, or for an object above it so they propagate down) allows you to send the report to any email address you want.

I hope this helps!

23 Apr 2013

Write permission error when inserting over linked server

I spent some time troubleshooting permissions over a linked server recently before finding out that the cause of my error wasn't permissions-related at all. I was attempting to perform an insert on a remote table, and was getting the following error:

Msg 7344, Level 16, State 1, Line 2
The OLE DB provider "SQLNCLI10" for linked server "RemoteServer" could not INSERT INTO table "[RemoteServer].[RemoteDB].[dbo].[IdentInsertTest]" because of column "ID". The user did not have permission to write to the column.

After some time attempting to isolate the missing permissions, I realized that it was actually a disguised error message. I was trying to insert a value into an identity column, but rather than the standard error message I expected to see in that case, I got a generic "You don't have permission" message, leading to some wasted time troubleshooting.

To recreate the issue, you can follow these steps:

-- Create a test table
CREATE TABLE IdentInsertTest (
	ID INT IDENTITY(1,1),
	SomeValue VARCHAR(10)
)

-- This insert will succeed
INSERT INTO IdentInsertTest (SomeValue)
SELECT 'Some Value'

-- Will fail with IDENTITY_INSERT error
INSERT INTO IdentInsertTest (ID, SomeValue)
SELECT 10, 'Some Value'

The second statement will fail with the standard error message:

Cannot insert explicit value for identity column in table 'IdentInsertTest' when IDENTITY_INSERT is set to OFF.

Now, connect to another server and set up a linked server to the other instance, and then try these statements again:

-- This remote insert will succeed
INSERT INTO LinkedServer.RemoteDB.dbo.IdentInsertTest (SomeValue)
SELECT 'Some Value'

-- Will fail with a permissions error
INSERT INTO LinkedServer.RemoteDB.dbo.IdentInsertTest (ID, SomeValue)
SELECT 10, 'Some Value'

If I'd realized what I was doing, it would have saved me some troubleshooting time! The moral here is that if your statement fails over a linked server, ensure your user account is set up correctly, and then test the statement locally - you may get a more accurate error message!
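If you genuinely need to insert an explicit identity value on the remote table, one workaround (a sketch, assuming the linked server has "rpc out" enabled and your login has rights to toggle IDENTITY_INSERT there) is to run the whole batch on the remote server with EXECUTE ... AT, so that SET IDENTITY_INSERT applies in the same remote session:

-- One-time setup: allow remote procedure calls over the linked server
EXEC sp_serveroption 'LinkedServer', 'rpc out', 'true'

-- Run the insert remotely so IDENTITY_INSERT is set in the same session
EXEC ('SET IDENTITY_INSERT RemoteDB.dbo.IdentInsertTest ON;
       INSERT INTO RemoteDB.dbo.IdentInsertTest (ID, SomeValue)
       SELECT 10, ''Some Value'';
       SET IDENTITY_INSERT RemoteDB.dbo.IdentInsertTest OFF;') AT LinkedServer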

9 Apr 2013

Identifying row in SQL view (or table) with a calculation error

Computed columns can be a great tool if you want to add a value to a table that's dependent on other columns in the table, and you don't want to maintain it on its own every time the source columns change. In a view, these can provide a single, consistent calculation to the end users.

-- Set up a view with key columns and calculation errors in it
-- The first calculation will always work
-- The second gives a divide by zero error on every 10th row
CREATE VIEW SourceObject AS
SELECT object_id AS KeyColumn1,
       column_id as KeyColumn2,
       object_id - column_id as CalcColumn1,
       (object_id / (column_id % 10)) as CalcColumn2
  FROM msdb.sys.columns

Now that it's set up, we can try selecting all the rows from the view we just created, and we'll see about 100 rows output, and then the query will stop with a "Divide by zero" error:

SELECT * FROM SourceObject

The calculation in this query is pretty straightforward, and you can see which rows are causing a problem (those where column_id is divisible by 10), but what if it were more complicated? The problem is that SQL doesn't display the row that had a problem - it stops on the row before it, so finding the row with the error is a bit more difficult. What if there were multiple columns involved in the calculation, or different combinations of values that could result in an error? Tracking down the rows causing the errors can be difficult - you have to find all the possible conditions that could cause an error, and then query for each of them.

This script will allow you to find all the rows in a view with a calculation error, all at once. It uses a cursor to pull the rows from the view one at a time, test the calculation, and then write any errors it finds to a table where you can see the rows that are causing problems. Using a CURSOR generally isn't ideal, but in this case, it's the only way to react to a bad calculation on a row-by-row basis and deal with it.

The script uses two key values from your view - they're called KeyColumn1 and KeyColumn2 - and you can modify the script to name them whatever you want, or use just a single value if that makes more sense in your scenario. It also supports two computed columns - CalcColumn1 and CalcColumn2 - though again, it could be changed to check just a single column.

 -- Set up variables
DECLARE @KeyColumn1 INT,
		@KeyColumn2 INT,
		@CalcColumn1 INT,
		@CalcColumn2 INT
		
DECLARE @CurrentRow BIGINT
	SET @CurrentRow = 1

-- Set up a place to hold key values for rows that work  
  SELECT TOP 0 KeyColumn1, KeyColumn2
    INTO #WorkingRows
    FROM SourceObject

-- Set up a place to hold errors for rows that don't work    
CREATE TABLE #ErrorRows (
	RowNumber BIGINT,
	KeyColumn1 INT,
	KeyColumn2 INT,
	[ERROR_NUMBER] INT,
	[ERROR_MESSAGE] nvarchar(4000)
)    

-- Begin loop to look through rows in the view 
  DECLARE cur INSENSITIVE CURSOR FOR
  SELECT KeyColumn1, KeyColumn2
  FROM SourceObject
  ORDER BY KeyColumn1, KeyColumn2
  
  OPEN cur
  
  FETCH NEXT FROM cur
  INTO @KeyColumn1, @KeyColumn2
	
  WHILE @@FETCH_STATUS = 0
  BEGIN
  
	BEGIN TRY

		-- Try to select the calculated columns	
		-- If there's an error, it will jump to the CATCH block
		SELECT @CalcColumn1 = CalcColumn1,
				@CalcColumn2 = CalcColumn2
		  FROM SourceObject
		 WHERE KeyColumn1 = @KeyColumn1
		   AND KeyColumn2 = @KeyColumn2
		   
		-- This lookup succeeded
		INSERT INTO #WorkingRows
		SELECT @KeyColumn1, @KeyColumn2
		
	END TRY
	BEGIN CATCH
		
		-- The lookup failed - save details
		INSERT INTO #ErrorRows
		SELECT @CurrentRow,
				@KeyColumn1,
				@KeyColumn2,
				ERROR_NUMBER(),
				ERROR_MESSAGE()			
		
	END CATCH
	
	SET @CurrentRow = @CurrentRow + 1
	    
	FETCH NEXT FROM cur
	INTO @KeyColumn1, @KeyColumn2
	    
  END
  
  -- Show the key columns of rows with errors
  SELECT * FROM #ErrorRows
  
  -- Show the key columns of working rows
  SELECT * FROM #WorkingRows
  
  -- Clean things up
  CLOSE cur
  DEALLOCATE cur
  DROP TABLE #ErrorRows
  DROP TABLE #WorkingRows

At the end, you'll have two tables with results in them - #ErrorRows, which contains the key values for rows with errors in them, as well as details about the error message, and #WorkingRows, which contains the key values for all of the working rows from the view.

Note: I could just as easily have set up a table with a computed column in it that causes the same problem. You'd be unable to select the entire table without an error, and hunting down the row with the error is just as painful. The script to find the error is the same, but here's an example of a table that has a computed column with this problem:

-- Set up table with a list of numbers in it
SELECT object_id AS KeyColumn1,
		RANK() OVER (PARTITION BY 0
					 ORDER BY NEWID()) as KeyColumn2
  INTO SourceObject
  FROM msdb.sys.columns
  
-- Add two calculations to the table
-- The first will always work
-- The second will give a "Divide by zero" every 100 rows
  ALTER TABLE SourceObject ADD CalcColumn1 as (KeyColumn1 - KeyColumn2)
  ALTER TABLE SourceObject ADD CalcColumn2 as (KeyColumn1 / (KeyColumn2 % 100))

-- Note that you can't add a persisted computed column to a table
-- if there's a calculation error in any of the rows, so this
-- command will fail with a "Divide by zero" error
  ALTER TABLE SourceObject ADD CalcColumn3 as (KeyColumn1 / (KeyColumn2 % 100)) PERSISTED
5 Apr 2013

Identifying SQL Server 2005 Enterprise Edition features in use

I recently stumbled across a great system view, sys.dm_db_persisted_sku_features, which identifies any Enterprise Edition features in use in the current database - features that would prevent you from moving that database to a Standard Edition instance. Unfortunately, it first appeared in SQL 2008, and I wanted to run this check on a SQL 2005 system.
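For reference, on SQL 2008 and later the check is a one-liner - run it in the database you're planning to move, and an empty result means no Enterprise-only features are in use:

SELECT feature_name
  FROM sys.dm_db_persisted_sku_features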

There are a number of server-level features of SQL 2005 that require Enterprise Edition, but only two database-level features - partitioning and the VarDecimal storage format. Both are easy to check for, so I put together this quick query to do it:

select * from
   (-- VarDecimal Storage Format
    select case
             when max(objectproperty(object_id, N'TableHasVarDecimalStorageFormat')) = 0
               then ''
             else 'VarDecimal Storage'
           end as feature_name
      from sys.objects
    UNION ALL
    -- Partitioning
    select case
             when max(partition_number) > 1
               then 'Partitioning'
             else ''
           end
      from sys.partitions
) t
where feature_name <> ''

On a SQL 2005 server, this query will serve the same purpose that sys.dm_db_persisted_sku_features does on SQL 2008/2012 servers.

18 Mar 2013

Unexpected results from ISNULL in SQL Compact 3.5

In SQL Server, the ISNULL function takes two arguments - a potentially NULL value and a replacement value. If the first value is NULL, the function returns the second value instead. It's always been this way - for example:

  SELECT V1, V2,
         ISNULL(V1, V2) as [Result],
         Descr
  FROM (SELECT 10 as V1, 20 as V2, 'Returns first' as Descr UNION
        SELECT 10 as V1, NULL as V2, 'Returns first' as Descr UNION
        SELECT NULL as V1, 20 as V2, 'Returns second' as Descr UNION
        SELECT NULL as V1, NULL as V2, 'Both NULL - return NULL' as Descr) t

In SQL Server, you get the following results:

V1      V2      Result  Descr
10      20      10      Returns first
10      NULL    10      Returns first
NULL    20      20      Returns second
NULL    NULL    NULL    Both NULL - return NULL

However, SQL Compact implements ISNULL differently, in a way that's not compatible at all - instead, the ISNULL function returns a 0 or 1 based on whether the first parameter is NULL! It doesn't even process the second parameter, but instead of failing with an "Invalid Syntax" error, it happily accepts it, executes, and returns unexpected results. In SQL Compact, you get the following result set:

V1      V2      Result  Descr
10      20      0       Returns first
10      NULL    0       Returns first
NULL    20      1       Returns second
NULL    NULL    1       Both NULL - return NULL

Notice the difference? The ISNULL column is returning a value based only on the first parameter (a "0" if it's not NULL, and a "1" if it is), and if you're not expecting it, it silently miscalculates any formulas containing that function. If you're relying on the function in any statements that add values together or make decisions based on the results, you're in for some very unexpected behavior (in my case, I noticed totals that were wildly incorrect).

As an alternative, you should use COALESCE - it accepts multiple parameters and walks down the list until it finds one that's not NULL, returning NULL if all parameters are NULL. It can be used in place of ISNULL in SQL Server code with exactly the same behavior (i.e. no changes), and as a bonus, it acts the same in SQL Compact, so your formulas will work properly there as well. The example above becomes:

  SELECT V1, V2,
         COALESCE(V1, V2) as [Result],
         Descr
  FROM (SELECT 10 as V1, 20 as V2, 'Returns first' as Descr UNION
        SELECT 10 as V1, NULL as V2, 'Returns first' as Descr UNION
        SELECT NULL as V1, 20 as V2, 'Returns second' as Descr UNION
        SELECT NULL as V1, NULL as V2, 'Both NULL - return NULL' as Descr) t

And on both SQL Server and SQL Compact, the result set is the same:

V1      V2      Result  Descr
10      20      10      Returns first
10      NULL    10      Returns first
NULL    20      20      Returns second
NULL    NULL    NULL    Both NULL - return NULL

+1 for consistency!

MORAL of the story:

When executing statements on different platforms, ensure the functions you use behave in the same manner on both - and even when considering multiple platforms from the same vendor, named almost identically, don't assume things behave the same way unless you've tested them 🙂

10 Mar 2013

“Initializing Reconciler has failed” when setting up SQL Compact replication

When initializing replication to a .NET Compact Framework client on a mobile device, I received an error message when I attempted to start the synchronization:

Initializing the SQL Server Reconciler has failed. Try again.

I had confirmed that SQL Compact web replication was set up correctly, and checking the URL came back as expected. Searching for the error online turned up a dozen recommendations, but when I traced the replication sync attempt, I saw the following statement executed:

exec sp_helpdistpublisher N'SQLSERVERNAME'

Followed immediately by the error message:

The remote server "SQLSERVERNAME" does not exist, or has not been designated as a valid Publisher, or you may not have permission to see available Publishers.

Sure enough, executing that command in SSMS, logged in as my replication user, gave me the same error message. At some point, I'd changed the user I was using to set up the subscription, and that user didn't have rights to view the publication list on my SQL Server. The fix was pretty easy:

  1. In SQL Management Studio, right-click the publication
  2. Select "Properties" and then open the "Publication Access List" tab
  3. Add the user you're connecting your subscriber with to this list

Here's a shot of the screen where I had to make this change, in case there's any confusion:

Publication Security Settings
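If you'd rather script the change than click through the dialog, the equivalent T-SQL (run at the publisher, in the publication database - the publication and login names here are placeholders) looks something like this:

-- Add the subscriber's login to the Publication Access List
EXEC sp_grant_publication_access
     @publication = 'MyPublication',
     @login = 'MYDOMAIN\MyReplUser'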

2 Mar 2013

Refreshing changed .NET SQL CLR assemblies after patching/updates

After applying some Windows updates to one of my servers, I started getting the following error when I ran a custom .NET SQL CLR stored proc:

Server: Msg 6522, Level 16, State 2, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'somemethodyourecalling':

System.IO.FileLoadException: Could not load file or assembly 'System.Drawing, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. Assembly in host store has a different signature than assembly in GAC. (Exception from HRESULT: 0x80131050)

I'd imported some additional assemblies into SQL Server for use in SQL CLR mapped stored procedures, and the Windows updates had included a service pack for .NET, which changed some of the assemblies. Now the version I'd linked to SQL Server no longer existed on disk, and SQL couldn't load it - that made sense, and is documented pretty clearly in this MS Support article: http://support.microsoft.com/kb/949080

However, I had dozens of servers with SQL CLR components enabled, and hundreds of different assemblies loaded across them all - and not always the same ones on each server - so a standard update script wouldn't work to refresh all the changed assemblies. (The MS Support link provides a list of the standard assemblies that cause this error, but if you've got custom assemblies loaded, or you've loaded an assembly that's not specifically cleared for SQL CLR, it won't be on that list either.) To deal with this, I wrote a script that fetches the assembly list for a database and attempts to refresh each one from its disk location. If an assembly hasn't changed, the update attempt fails with a specific error message about the MVID, and there's no change for that assembly.

Also, I've commented out the line that restricts it to just framework assemblies (System.* and Microsoft.*), but you can uncomment that line if you'd like to restrict the refresh from attempting to reload your custom assemblies as well.

DECLARE @AssemblyName VARCHAR(255),
	    @AssemblyLocation VARCHAR(255),
	    @AlterAssemblyCommand NVARCHAR(1024),
	    @DotNetFolder VARCHAR(100)

   SET @DotNetFolder = 'C:\Windows\Microsoft.NET\Framework\v2.0.50727'

CREATE TABLE #Results (
		AssemblyName VARCHAR(255),
		AssemblyLocation VARCHAR(255),
		AlterAssemblyCommand NVARCHAR(1024),
		Results VARCHAR(1024)
)

select sa.name as AssemblyName,
		saf.name as Assemblylocation,
		case when charindex('\', saf.name) = 0
			then 'ALTER ASSEMBLY [' + sa.name + '] FROM ''' + @DotNetFolder + '\'
			else 'ALTER ASSEMBLY [' + sa.name + '] FROM '''
		end + saf.name + (case right(saf.name, 4) when '.dll' then '' else '.dll' end) + ''''
		as AlterAssemblyCommand
INTO #Refresh
from sys.assemblies sa
join sys.assembly_files saf
  on sa.assembly_id = saf.assembly_id
where sa.name <> ('Microsoft.SqlServer.Types')
  --and (sa.name like 'System.%' or sa.name like 'microsoft.%')

DECLARE Commands CURSOR FAST_FORWARD FOR
SELECT AssemblyName,
	   AssemblyLocation,
	   AlterAssemblyCommand
  FROM #Refresh

OPEN Commands

FETCH NEXT FROM Commands
INTO @AssemblyName,
	   @AssemblyLocation,
	   @AlterAssemblyCommand

WHILE @@FETCH_STATUS = 0
BEGIN

	BEGIN TRY
		exec sp_executesql @AlterAssemblyCommand

		insert into #Results
		SELECT @AssemblyName,
				@AssemblyLocation,
				@AlterAssemblyCommand,
				'Assembly refreshed successfully'

	END TRY
	BEGIN CATCH

		insert into #Results
		SELECT @AssemblyName,
				@AssemblyLocation,
				@AlterAssemblyCommand,
				CASE ERROR_NUMBER()
					WHEN 6285 THEN 'No update necessary (MVID match)'
					WHEN 6501 THEN 'Physical assembly not found at specified location (SQL Error 6501)'
					ELSE ERROR_MESSAGE() + ' (SQL Error ' + convert(varchar(10), ERROR_NUMBER()) + ')'
				END

	END CATCH

	FETCH NEXT FROM Commands
	INTO @AssemblyName,
		   @AssemblyLocation,
		   @AlterAssemblyCommand

END

CLOSE Commands
DEALLOCATE Commands

SELECT * FROM #Results

drop table #refresh
drop table #Results

While troubleshooting the error, I came across this as well - I don't have a SQL 2012 server handy to check with, but it looks like this problem might be resolved with a reboot in SQL 2012:

http://msdn.microsoft.com/en-us/library/hh479773.aspx

I'm not sure if that's the case, but it would make things easier. Also, it would be nice if there was some kind of "ALTER ASSEMBLY [SomeAssembly] REFRESH" command to address this problem, but maybe in a future version.

Additionally, this error can occur if the signing of an assembly has changed but not its signature; in that case, just reloading it from disk won't work, because SQL Server decides whether an assembly has changed by its MVID, which doesn't change unless there's been an actual code change of some kind (see the Connect bug here). In those cases, you'll need to drop and recreate the assembly, including any supporting objects that rely on it.
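For that drop-and-recreate scenario, the outline looks something like the following sketch - the object and path names are placeholders, and you'd repeat the drop/recreate for every proc, function, or type mapped to the assembly:

-- Drop the objects that depend on the assembly first
DROP PROCEDURE dbo.SomeClrProc

-- Drop and reload the assembly from its current on-disk version
DROP ASSEMBLY [SomeAssembly]
CREATE ASSEMBLY [SomeAssembly]
FROM 'C:\Some\Path\SomeAssembly.dll'
WITH PERMISSION_SET = SAFE

-- Recreate the dependent objects against the reloaded assembly
CREATE PROCEDURE dbo.SomeClrProc
AS EXTERNAL NAME [SomeAssembly].[SomeNamespace.SomeClass].[SomeMethod]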

13 Feb 2013

Matching up SQL CE DLL version numbers

The Abstract:

While developing a Windows Mobile device application on two different computers, I thought I'd installed SQL Server Compact Edition 3.5 SP2 on both of them, but ran into some version difference issues. In researching it, I couldn't find good information about the different versions of the System.Data.SqlServerCe DLL, so I thought some future developers might enjoy having what I found out, all in one place.

The Problem:

After checking the project into source control on one computer and fetching it on the other, I'd see a broken reference to System.Data.SqlServerCe:

Missing reference to System.Data.SqlServerCe

And I received the following message (and about 100 errors - one for every reference to the namespace in my code) when I attempted to compile:

Warning 10
Could not resolve this reference. Could not locate the assembly "System.Data.SqlServerCe, Version=3.5.1.0, Culture=neutral, PublicKeyToken=3be235df1c8d2ad3, processorArchitecture=MSIL". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.

If I dropped the reference, re-added it on my second computer (BTW - only one version was listed in my .NET reference list on both machines) and checked it back in, then I'd get the same situation on my first computer with the following error message:

Warning 10
Could not resolve this reference. Could not locate the assembly "System.Data.SqlServerCe, Version=3.5.0.0, Culture=neutral, PublicKeyToken=3be235df1c8d2ad3, processorArchitecture=MSIL". Check to make sure the assembly exists on disk. If this reference is required by your code, you may get compilation errors.

The Research:

I had a DLL version mismatch - a general problem when you've got multiple computers you're developing on, since you have to ensure the same versions and packages are installed on each. However, searching for details on the different versions of this DLL in the wild wasn't fruitful - did I have SQL Compact 3.5 RTM installed (3.5.0.0)? If I'd installed v3.5 SP2, why was it shown as 3.5.1.0 (shouldn't it be 3.5.2.0?) on one machine, but as 3.5.0.0 on the other? To top that off, both versions (v3.5.0.0 and v3.5.1.0) were in my GAC as MSIL - what?

It turns out that when you add a reference to this DLL in a Compact Framework project, Visual Studio uses the following file (depending on whether you're on an x86 or x64 machine):

x86: C:\Program Files\Microsoft SQL Server Compact Edition\v3.5\Devices\System.Data.SqlServerCe.dll
x64: C:\Program Files (x86)\Microsoft SQL Server Compact Edition\v3.5\Devices\System.Data.SqlServerCe.dll

There, I found my issue - that file version was a different version on each machine (3.5.5692.0 on one and 3.5.8080.0 on the other). To find out why, I had to dig into the registry, where I found all kinds of version information about the SQL Server Compact Edition components I'd installed, the DLL version, and the service pack levels:

x86 or x64 components
HKLM\SOFTWARE\Microsoft\Microsoft SQL Server Compact Edition\v3.5\ENU

or if you want to see x86 components installed on your x64 machine:
HKLM\SOFTWARE\Wow6432Node\Microsoft\Microsoft SQL Server Compact Edition\v3.5\ENU

There, you'll find a combination of "Version" and "ServicePackLevel" entries for every SQL CE component you have installed, and it will shed some light. Here are the version numbers I found (for the various components):

SQL Compact 3.5 RTM
ServicePackLevel: 0
Version: 3.5.5386.0

SQL Compact 3.5 SP1
ServicePackLevel: 1
Version: 3.5.5692.0

SQL Compact 3.5 SP2
ServicePackLevel: 2
Version: 3.5.8080.0
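If you just want to dump those entries quickly from a command prompt instead of opening regedit, reg query works (this is the native path from above - use the Wow6432Node path for x86 components on an x64 machine):

reg query "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server Compact Edition\v3.5\ENU"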

So how did I end up with the SP1 version installed on one machine, but not on the other? Since I'd installed SQL Compact SP2 on both machines, I was baffled. As it turns out, Visual Studio 2008 RTM deploys SQL Compact 3.5 RTM (3.5.5386.0), and applying Visual Studio 2008 SP1 patches that to the SP1 version (3.5.5692.0) - notice the VSToolsVersion/VSToolsServicePackLevel values in the registry location above, which reflect your current service pack level of Visual Studio 2008.

The Solution:

While I'd installed VS 2008 SP1 on both machines, I'd installed the actual SQL Compact 3.5 SP2 for Windows Mobile (EDIT 2016-10-31: This link has been updated to the current location for this download) package on only one machine and not the other - this left the DLL on one machine patched to SP1 (courtesy of Visual Studio) and the other machine with a fully-patched SP2 DLL (courtesy of the actual SP2 Windows Mobile installer).

I confirmed this by running the SP2 for Windows Mobile installer, and it patched my DLL right up to the full SP2 version - problem resolved, and the project now opens up on both workstations with no need for any DLL reference swapping.

The Moral:

Honestly, this was a bit embarrassing and a pretty amateur move, where some vigilance could have saved me quite a bit of aggravation.

ENSURE YOU'RE DEPLOYING ALL THE SAME PACKAGES ON ALL DEVELOPMENT MACHINES! JUST BECAUSE SOMETHING ALREADY EXISTS ON ONE MACHINE, DON'T ASSUME IT'S THE SAME VERSION!