Troubleshooting "File Not Found Exception" - "Could not load file or assembly"
I'm experiencing this on Ubuntu, using .NET 6, trying to use SQLite.
This same application was talking to SQLite just fine yesterday - but now:
"System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.Data.Sqlite, Version=7.0.5.0, Culture=neutral, PublicKeyToken=adb9793829ddae60'. The system cannot find the file specified."
The file Microsoft.Data.Sqlite is right there in the same folder as the binary.
On Windows, to really deeply troubleshoot this we used to use fuslogvw (the Fusion Log Viewer) -- and another technique was to subscribe to the AssemblyResolve event and log what the runtime was asking for when a load failed.
(i.e. AppDomain.CurrentDomain.AssemblyResolve += AssemblyUtils.AssemblyResolver; -- a technique I've used a lot, initially inspired by this code from Joseph "LINQPad" Albahari: C#4 Chapter 17 - Assemblies - C# in a Nutshell - Code Listings)
On Ubuntu... I don't know what to do! -- but some research has shown me a way!
TL;DR -- at this point I went down a dotnet-trace rabbit hole, which you can go ahead and read, or you can skip down to "Running with the LD_DEBUG Environment variable" to see how I used the LD_DEBUG environment variable to really solve the issue. I'm sure dotnet-trace will be useful for other issues.
There is a tool called dotnet-trace -- hence here is a guide to:
Troubleshooting "File Not Found Exception" - "Could not load file or assembly" with dotnet-trace
Install dotnet-trace
Do this on the machine where the app is failing, like so:
dotnet tool install --global dotnet-trace
The last part of the output from this:
You can invoke the tool using the following command: dotnet-trace
Tool 'dotnet-trace' (version '7.0.421201') was successfully installed.
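If the shell then claims it can't find dotnet-trace, the usual suspect is that the global tools folder isn't on your PATH. Here is a small sketch for checking that. It assumes the default Linux location for global tools, $HOME/.dotnet/tools, and path_has is just a helper name made up for this example.

```shell
# Succeed if directory $1 is one of the colon-separated entries in $2.
path_has() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Assumption: the default global-tools directory on Linux.
tools_dir="$HOME/.dotnet/tools"

if path_has "$tools_dir" "$PATH"; then
  echo "$tools_dir is already on PATH"
else
  echo "add it with: export PATH=\"\$PATH:$tools_dir\""
fi
```

Wrapping both strings in colons means the match only succeeds on a whole PATH entry, not on a directory that merely starts with the same text.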
Now -- the basic process is:
- Run your application (the one that experiences the problem)
- Find out its process id
- Tell dotnet-trace to trace that process id, and capture events related to assembly loading
- Stop dotnet-trace
- Find the file it has created
- Import that file onto your Windows machine and inspect it there (using Visual Studio)
(There are other ways you can do those last steps -- you could convert it to a different format on Linux (speedscope) and inspect it right there on Linux... or bring it to Windows/Mac and use tools other than Visual Studio to do the analysis.)
1. Run your application (the one that experiences the problem)
In my case the application is always running, supervisor takes care of that.
Otherwise -- run it from the console. And then leave that console open. From a different console, you will do the next steps:
2. Find out its process id
This is something I often have to do anyway... I use "ps x | grep STRING" where STRING is a part of the name of the application.
e.g.
ps x | grep dot
shows me all running apps whose command line includes "dot" (which catches "dotnet").
One of those is my target app --
1751 ? SLl 0:07 /usr/bin/dotnet /opt/webapps/mywebapp/app/myapp.dll
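If you'd rather not eyeball the list, the process id is simply the first column of the matching line. A minimal sketch: pid_from_ps is a made-up helper name, and the sample line is the one shown above. On most systems pgrep -f myapp.dll should give you the same answer.

```shell
# Print the first column (the PID) of every `ps x` line that mentions $1,
# skipping the grep process itself if it shows up in the listing.
pid_from_ps() {
  awk -v pat="$1" 'index($0, pat) && !/grep/ { print $1 }'
}

# Sample line copied from the listing above; on a live box you would pipe
# real output instead:  ps x | pid_from_ps myapp.dll
sample='1751 ? SLl 0:07 /usr/bin/dotnet /opt/webapps/mywebapp/app/myapp.dll'
printf '%s\n' "$sample" | pid_from_ps myapp.dll
```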
3. Tell dotnet-trace to trace that process id, and capture events related to assembly loading
Now we know the process id is 1751 -- we run dotnet-trace and give it that process id.
We also tell it which events to capture (I'll explain in a minute)
dotnet-trace collect --providers Microsoft-Windows-DotNETRuntime:4:4 --process-id 1751
The provider Microsoft-Windows-DotNETRuntime is relevant for all the assembly load / fail events.
The 4:4... the first 4 is a keyword mask that selects which items from this list to capture:
Keyword String Alias | Keyword Hex Value |
---|---|
gc | 0x1 |
gchandle | 0x2 |
fusion | 0x4 |
loader | 0x8 |
jit | 0x10 |
ngen | 0x20 |
startenumeration | 0x40 |
endenumeration | 0x80 |
security | 0x400 |
appdomainresourcemanagement | 0x800 |
jittracing | 0x1000 |
interop | 0x2000 |
contention | 0x4000 |
exception | 0x8000 |
threading | 0x10000 |
jittedmethodiltonativemap | 0x20000 |
overrideandsuppressngenevents | 0x40000 |
type | 0x80000 |
gcheapdump | 0x100000 |
gcsampledobjectallocationhigh | 0x200000 |
gcheapsurvivalandmovement | 0x400000 |
gcheapcollect | 0x800000 |
gcheapandtypenames | 0x1000000 |
gcsampledobjectallocationlow | 0x2000000 |
perftrack | 0x20000000 |
stack | 0x40000000 |
threadtransfer | 0x80000000 |
debugger | 0x100000000 |
monitoring | 0x200000000 |
codesymbols | 0x400000000 |
eventsource | 0x800000000 |
compilation | 0x1000000000 |
compilationdiagnostic | 0x2000000000 |
methoddiagnostic | 0x4000000000 |
typediagnostic | 0x8000000000 |
...we are after "fusion" -- so we want "4". The values are hexadecimal bit flags, so you combine keywords by adding (OR-ing) them: if we wanted fusion (4) plus gc (1) then we'd give the value "5".
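Because these are hex flags, it's easy to slip once the sum goes past 9. Here is a small sketch that lets the shell do the arithmetic. It assumes dotnet-trace reads the keyword field as hex, which is what the table's hex values suggest; the variable names are just for illustration.

```shell
# Keyword values from the table above (hex bit flags).
gc=0x1
fusion=0x4
loader=0x8

# OR the flags together, then print the result as hex without the 0x prefix.
fusion_gc=$(printf '%x' $(( $fusion | $gc )))
fusion_loader=$(printf '%x' $(( $fusion | $loader )))

echo "Microsoft-Windows-DotNETRuntime:${fusion_gc}:4"
echo "Microsoft-Windows-DotNETRuntime:${fusion_loader}:4"
```

Note the second one: fusion plus loader is "c", not "12".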
What about the second number?
I said to use:
"Microsoft-Windows-DotNETRuntime:4:4"
The first 4 was 'fusion' - the second number is the clreventlevel, i.e. how verbose the captured events should be (see "tracing event levels" under Sources):
Level | Meaning | Description |
---|---|---|
0 | LogAlways | No level filtering is done on the event. When used as a level filter for enabling events, for example in EventListener.EnableEvents(), events of all levels will be included. |
1 | Critical | This level corresponds to a critical error, which is a serious error that has caused a major failure. |
2 | Error | This level adds standard errors that signify a problem. |
3 | Warning | This level adds warning events (for example, events that are published because a disk is nearing full capacity). |
4 | Informational | This level adds informational events or messages that are not errors. These events can help trace the progress or state of an application. |
5 | Verbose | This level adds lengthy events or messages. It causes all events to be logged. |
Now that it's running, do whatever is needed to "cause" the File Not Found error you want to debug.
4. Stop dotnet-trace
Stop dotnet-trace with Ctrl-C.
5. Find the file it has created
You will see that it has written a *.nettrace file in your home/user (~) folder (or whichever directory you ran it from) -- it will tell you the exact name in its output.
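If the output has already scrolled away, you can just ask for the newest *.nettrace file in a folder. A rough sketch follows. latest_nettrace is a made-up helper name, and the -nt test it relies on is not strictly POSIX, though bash and dash both support it.

```shell
# Print the most recently modified *.nettrace file in directory $1
# (default: the current directory). Prints nothing if there is none.
latest_nettrace() {
  newest=
  for f in "${1:-.}"/*.nettrace; do
    [ -e "$f" ] || continue              # the glob matched nothing
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  fi
}

latest_nettrace "$HOME"
```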
6. Import that file onto your Windows machine and inspect it there (using Visual Studio)
Copy that *.nettrace file onto your Windows machine.
Open it with Visual Studio -- sometimes this fails. I found I needed a "clean" instance of Visual Studio.
First it converts the file into a *.nettrace.etlx file.
Now look at what the events say.
Look really hard.
Look until the beads of blood begin to pool on your forehead.
This is where I ran out of luck atm.
But tomorrow brings fresh hope.
Other thoughts
From "How the runtime locates assemblies":
1. Determines the correct assembly version by examining applicable configuration files, including the application configuration file, publisher policy file, and machine configuration file. If the configuration file is located on a remote machine, the runtime must locate and download the application configuration file first.
2. Checks whether the assembly name has been bound to before and, if so, uses the previously loaded assembly. If a previous request to load the assembly failed, the request is failed immediately without attempting to load the assembly.
   Note - The caching of assembly binding failures is new in .NET Framework version 2.0.
3. Checks the global assembly cache. If the assembly is found there, the runtime uses this assembly.
4. Probes for the assembly using the following steps:
   a. If configuration and publisher policy do not affect the original reference and if the bind request was created using the Assembly.LoadFrom method, the runtime checks for location hints.
   b. If a codebase is found in the configuration files, the runtime checks only this location. If this probe fails, the runtime determines that the binding request failed and no other probing occurs.
   c. Probes for the assembly using the heuristics described in the probing section. If the assembly is not found after probing, the runtime requests the Windows Installer to provide the assembly. This acts as an install-on-demand feature.
Note: There is no version checking for assemblies without strong names, nor does the runtime check in the global assembly cache for assemblies without strong names.
Note this --
If a codebase is found in the configuration files, the runtime checks only this location. If this probe fails, the runtime determines that the binding request failed and no other probing occurs.
"codebase" here is an element, as in <codeBase>...</codeBase>.
Note that assembly binding failures are cached, not repeated.
Hence you may fail to capture the failure if the bind is not actually attempted again while you are tracing.
To revert to the behavior of the .NET Framework versions 1.0 and 1.1, which did not cache binding failures, include the <disableCachingBindingFailures> element in your configuration file.
How is probing performed:
- Application base, which is the root location where the application is being executed.
- Culture, which is the culture attribute of the assembly being referenced.
- Name, which is the name of the referenced assembly.
- The privatePath attribute of the <probing> element, which is the user-defined list of subdirectories under the root location. This location can be specified in the application configuration file and in managed code using the AppDomainSetup.PrivateBinPath property for an application domain. When specified in managed code, the managed code privatePath is probed first, followed by the path specified in the application configuration file.
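To make those four ingredients a bit more concrete, here is a rough sketch that prints the kind of locations they combine into. This is my reading of the .NET Framework probing heuristics, not something lifted from the runtime: the real binder also tries .exe, the exact order is an assumption, and .NET 6 on Linux resolves assemblies differently. probe_candidates and the paths are made up for illustration.

```shell
# probe_candidates APPBASE NAME CULTURE [PRIVATE_PATH...]
# Prints candidate .dll locations built from the application base, the
# optional culture, the assembly name, and any privatePath subdirectories.
probe_candidates() {
  base=$1
  name=$2
  culture=$3
  shift 3
  # The empty entry stands for the application base itself.
  for sub in "" "$@"; do
    dir=$base${sub:+/$sub}${culture:+/$culture}
    printf '%s\n' "$dir/$name.dll" "$dir/$name/$name.dll"
  done
}

# Hypothetical app in /opt/app, neutral culture, privatePath="bin".
probe_candidates /opt/app MyLib "" bin
```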
Running with the LD_DEBUG Environment variable
In the error message it said:
System.DllNotFoundException: Unable to load shared library 'e_sqlite3' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: libe_sqlite3: cannot open shared object file: No such file or directory
So I decided to take the advice and "consider setting the LD_DEBUG environment variable"
To do this -- I ran the app preceded by LD_DEBUG="libs" like so:
LD_DEBUG="libs" /usr/bin/dotnet /opt/webapps/myapp.dll -r linux-x64 --sc
(The -r linux-x64 and --sc were already part of my command for starting the app. I'm running it as a "self-contained" app, on 64 bit linux.)
Now it gave a lot more output and I got to see a much clearer picture of what was actually failing.
There were a few hundred lines of output -- I was interested in what it was saying overall, as well as the first and second major errors and how it responded to them.
Here's just those bits...
LD_DEBUG="libs" /usr/bin/dotnet /opt/webapps/myapp.dll -r linux-x64 --sc
27968: find library=libpthread.so.0 [0]; searching
... (100 lines gone)
27968: calling init: /opt/webapps/libcoreclrtraceptprovider.so
27968:
27968: /usr/bin/dotnet: error: symbol lookup error: undefined symbol: DllMain (fatal)
27968: /opt/webapps/libcoreclr.so: error: symbol lookup error: undefined symbol: PAL_RegisterModule (fatal)
27968:
27968: calling init: /opt/webapps/libclrjit.so
27968:
27968: /opt/webapps/libclrjit.so: error: symbol lookup error: undefined symbol: DllMain (fatal)
27968: /usr/bin/dotnet: error: symbol lookup error: undefined symbol: DllMain (fatal)
27968: find library=e_sqlite3.so [0]; searching
`libclrjit.so` is trying to find `e_sqlite3.so`
27968: search cache=/etc/ld.so.cache
27968: search path=/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/x86_64:/usr/lib/x86_64-linux-gnu:/lib/tls/x86_64:/lib/tls:/lib/x86_64:/lib:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib (system search path)
27968: trying file=/lib/x86_64-linux-gnu/tls/x86_64/e_sqlite3.so
27968: trying file=/lib/x86_64-linux-gnu/tls/e_sqlite3.so
27968: trying file=/lib/x86_64-linux-gnu/x86_64/e_sqlite3.so
27968: trying file=/lib/x86_64-linux-gnu/e_sqlite3.so
27968: trying file=/usr/lib/x86_64-linux-gnu/tls/x86_64/e_sqlite3.so
27968: trying file=/usr/lib/x86_64-linux-gnu/tls/e_sqlite3.so
27968: trying file=/usr/lib/x86_64-linux-gnu/x86_64/e_sqlite3.so
27968: trying file=/usr/lib/x86_64-linux-gnu/e_sqlite3.so
27968: trying file=/lib/tls/x86_64/e_sqlite3.so
27968: trying file=/lib/tls/e_sqlite3.so
27968: trying file=/lib/x86_64/e_sqlite3.so
27968: trying file=/lib/e_sqlite3.so
27968: trying file=/usr/lib/tls/x86_64/e_sqlite3.so
27968: trying file=/usr/lib/tls/e_sqlite3.so
27968: trying file=/usr/lib/x86_64/e_sqlite3.so
27968: trying file=/usr/lib/e_sqlite3.so
27968:
27968: /lib/x86_64-linux-gnu/libc.so.6: error: version lookup error: version `GLIBC_2.28' not found (required by /opt/webapps/libe_sqlite3.so) (fatal)
And it ends by saying that /opt/webapps/libe_sqlite3.so (the sqlite binary I have included) requires GLIBC_2.28 - the GNU C Library, version 2.28.
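With a few hundred lines of output, it helps to let the shell dig out the interesting bit. A small sketch that pulls the required glibc version out of a "version lookup error" line. required_glibc is a made-up helper name, and the sample line is the one from the output above.

```shell
# Read LD_DEBUG output on stdin and print the glibc version from any
# "version `GLIBC_x.y' not found" line.
required_glibc() {
  sed -n "s/.*version \`GLIBC_\([0-9.]*\)' not found.*/\1/p"
}

# LD_DEBUG writes to stderr, so on a live run you would redirect it first:
#   LD_DEBUG=libs <your command> 2>&1 | required_glibc
sample="27968: /lib/x86_64-linux-gnu/libc.so.6: error: version lookup error: version \`GLIBC_2.28' not found (required by /opt/webapps/libe_sqlite3.so) (fatal)"
printf '%s\n' "$sample" | required_glibc
```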
It seems that on my virtual machine I have version 2.23, not 2.28.
I found this by running this command and looking at its output:
$ ldd --version
ldd (Ubuntu GLIBC 2.23-0ubuntu11.3) 2.23
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
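Comparing version numbers by eye is fine for 2.23 versus 2.28, but it gets error-prone (2.9 is older than 2.28, for instance). Here is a sketch of doing the comparison in the shell. It leans on sort -V, which is a GNU coreutils extension, and the helper names are made up for this example.

```shell
# Pull the version number off the first line of `ldd --version` output.
installed_glibc() {
  awk 'NR == 1 { print $NF }'
}

# Succeed when version $1 is at least version $2 (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample first line from above; live:  have=$(ldd --version | installed_glibc)
have=$(printf '%s\n' 'ldd (Ubuntu GLIBC 2.23-0ubuntu11.3) 2.23' | installed_glibc)
need=2.28

if version_ge "$have" "$need"; then
  echo "glibc $have satisfies GLIBC_$need"
else
  echo "glibc $have is too old for GLIBC_$need"
fi
```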
Or via this other technique -- run this to find libc:
$ ldd `which ls` | grep libc
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8d676e8000)
Then run the libc library itself:
$ /lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.23-0ubuntu11.3) stable release version 2.23, by Roland McGrath et al.
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 5.4.0 20160609.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs>.
Further research indicated that this is typical: Ubuntu 16.04 ships glibc 2.23.
I need to upgrade my Ubuntu version (or do some other gnarly stuff).
And not only do I need to upgrade from Ubuntu 16.04 to Ubuntu 18.04 (the next long term support version) -- I then immediately need to upgrade to Ubuntu 20.04, as the 5 year support window for Ubuntu 18.04 ended literally yesterday (2023-05-31), the day I was researching this.
Never fear -- I found that by downgrading to an older version of the SQLite wrapper libraries, I could get the SQLite calls to continue working on Ubuntu 16.04.
I will separately prepare a new VM with all upgraded parts and then hot-swap everything over to it, like Indiana Jones swapping that golden skull for a bag of sand. Hopefully I don't end up having to outrun a giant rolling ball. But in any case, as they say in Latin, I have my running shoes on. (It sounds better in Latin, sorry I can't do the voices.)
Sources
- dotnet-trace performance analysis utility
- tracing event levels
- .NET runtime events
- Loader and binder runtime events - .NET | Microsoft Learn
- How the runtime locates assemblies
- C#4 Chapter 17 - Assemblies - C# in a Nutshell - Code Listings and C#10 Chapter 17 - Assemblies - C# in a Nutshell - Code Listings