Blog
Bulk Inserting data using Entity Framework

Bulk Inserting data using Entity Framework

Using tools like Entity Framework makes life far easier for a developer. Recently I blogged about how using them is what makes .Net Core one of the best platforms for prototype development, but the benefits don’t end there. They are also great from a security perspective by cutting a lot of risk around SQL injection attacks just by avoiding easy mistakes when using regular ADO.NET.

However, they do have some downsides, a main one being that they are particularly slow when it comes to doing bulk inserts to a database.

For example, assume you have an application which regularly receives an xml import file consisting of 200,000 records and each one either needs to be an insert of an update into the db. You’ll quickly learn that looping through the whole lot and then calling save changes results in a process taking an extremely long time to run, it may even just timeout. You then decide to get rid of that long save changes line by breaking it up into blocks of 500 and call save changes for each of those. That may save the timeout issue, but it still results in a process potentially lasting around an hour.

The problem is that this is a scenario Entity Framework or EF.Core just weren’t designed to handle. As a solution you could opt to drop Entity Framework altogether and revert to something like a native SQL Bulk Insert command, but what if you need to be doing some processing in code on the record before the import happens? What if you have one of those classic not quite always valid XML, XML files which would cause SQLs Bulk Insert to fail.

The solution is to use an open source extension called EFCore.BulkExtensions.

EFCore.BulkExtensions

EFCore.BulkExtensions is a set of extension methods to Entity Framework that provide the functionality to do bulk inserts. You can add it to your project using NuGet and you’ll find the project on GitHub here https://github.com/borisdj/EFCore.BulkExtensions

Usage is also very simple to do. Let’s assume you have some existing tradition EF code that loops through a collection and for each one create a new db item and adds it to the db:

1public void DoImport(List<foo> collection)
2{
3 foreach (var item in collection)
4 {
5 Jobs job = new Jobs();
6
7 job.DateAdded = DateTime.UtcNow;
8 job.Name = item.Name;
9 job.Location = item.Location;
10
11 await dbContext.Jobs.AddAsync(job);
12 }
13
14 await dbContext.SaveChangesAsync();
15}

Rather than adding each item to the Entity Framework db context, you instead create a list of those objects and then call a BulkInsert function with them on your db context.

1public void DoImport(List<foo> collection)
2{
3 List<Jobs> importJobs
4 foreach (var item in collection)
5 {
6 Jobs job = new Jobs();
7
8 job.DateAdded = DateTime.UtcNow;
9 job.Name = item.Name;
10 job.Location = item.Location;
11
12 importJobs.Add(job);
13 }
14
15 await dbContext.BulkInsert(importJobs);
16}

If also works for updates, but rather than creating a new item, first retrieve it form the db and then at the end call BulkInsertOrUpdate with the list.

1await dbContext.BulkInsertOrUpdate(importJobs);

From my experience doing this took my import process that would run for over an hour down to something which would complete in a few minutes.

Top reasons .Net is amazing for prototyping

Top reasons .Net is amazing for prototyping

The other day someone told me .net was slow to get something built, and to be fair to the person I can see why he would have thought that. Most of his interaction with .net projects have been on complex with large enterprise applications that often have integrated multiple other applications.

However, I would maintain that .net is a framework that is actually really fast to develop on and that what he had perceived as being slow was the complexities of a project rather than the actual coding time.

In fact, I would say it's so fast to get something built in it, it actually becomes the fastest thing to develop a prototype in. Here are my top reasons why.

ASP.NET Core

ASP.NET Core inherits all the best bits from the ASP.NET Framework that came before it, giving the framework almost 20 years of refinement since its original release in 2002. The days of WebForms in the original ASP.NET are now long behind us and we now have the choice of building web applications with either MVC or Razor Pages.

Razor provides the perfect combination of a view language providing helpers to render your html without limiting what can be done on the front end. How you code your HTML is still completely up to you, the helpers just provide features like binding that make it even faster to do.

Another great thing about .Net core over that which came before it, is its platform independent. Rather than being confined to just Windows, you can run it on Mac or Linux too.

Great starter templates

What kicks of a great prototype project is starting with great templates, and ASP.NET Core has a bunch.

As already mentioned, you can build a Web Application with either Razor Pages or MVC, but the templates also provide you with the base for building an API, Angular App, React.js or React.js and Redux or you can simply create an Empty application.

My preference is to go for MVC as it's what I'm the most familiar with and a key thing for rapidly building a prototype is that you develop rapidly. The idea is to focus on creating something new and unique, not learn how to develop in a new framework.

The MVC Web Application gives you a base site to work with a few pages already set up, bootstrap and jQuery are already included so you start right at the point of working on your logic rather than spending time doing setup.

SQL Server and EF Core

I've always been a bit of a database guy. I'm not sure why, but its a topic that has always just made sense to me, and despite being a topic that can get quite complex, the reasons behind it being complex always feel logical.

When it comes to building a prototype though there are two aspects which make storage with .net core super simple.

Firstly, Entity Framework Core (EF Core) means you don't really need to know any SQL or spend any time writing it. It helps if you do, but at a minimum all you need to be doing is creating a model in your code, adding a few lines for a DB context that tells EF.Core that a model is a table and how they relate. Then turn on migrations and you're done. When you run the application, the DB gets created for you and each time you change your model, you just add another migration and the next time the application runs the application the DB schema gets updated.

Querying your DB is done by writing LINQ queries against your entity framework model, allowing you to have next to no understanding of SQL and how the DB works. Everything is just done by magic for you.

The second part is SQL Server and its different versions. Often when you think of SQL Server you think of the big DB engine with its many many components that you install on a server or your local machine but there's two others which are even more important. LocalDB and Azure SQL.

LocalDB is an option that can be installed as part of Visual Studio. No separate application is needed or services to be running in the background. It is essentially the minimum required to start the DB engine to be used for development purposes. In practical terms this means when you start your application EF.Core can run off a LocalDB which didn't require any setup, but as far as your application in concerned it is no different than working with any other version of SQL Server.

Azure SQL as the name implies is SQL Server on Azure. The only thing I really need to say about this is that you can swap LocalDB and Azure SQL with ease. They may be different but as far as your prototype is concerned, they are the same.

Scaffolding

The only thing quicker than writing code is having someone else do it for you. So we've created our application from a template, added a model which generated our database and now its time to create some pages. Well the good news is it's still not time to write much code because Visual Studio can scaffold out pages based on our model for us!

Adding a controller to our project in Visual Studio gives us some options on what should be generated for us, one of which is MVC Controller with views, using Entity Framework. What that means is given a model it will create controllers and views for listing items, creating them, editing them and deleting them. No coding by us required!

Now it's unlikely that is exactly what you're after, but it's generally a good starting place and deleting code you don't need is far quicker then writing it.

Azure

Lastly there is Azure. You may have spotted a theme to all these points and that is they all remove any effort required to do any setup and instead focus on building your own logic, and this point is no different.

I remember a time, when if I wanted a server to put an application on, I had to request it, and then wait a while. What I would get back would either be a server that already had resources running on it, or a blank server that would need applications installed on it. e.g. SQL Server or .Net Framework. IIS wouldn't have been configured and it would be a number of hours before my application would be running.

With Azure you don't even really need to leave Visual Studio. From the publish dialog box you can create a new App Service and DB, and then publish. All the connection strings are sorted out for you. There are service plans which cost next to nothing, a domain is configured for you and at the end of the publish the website opens and is working. The whole process has taken less than 10 minutes.

Connecting Sitecore to Salesforce

Connecting Sitecore to Salesforce

When it comes to creating CMS websites the most common integration requirement is one to a CRM system. In this article I’m going to explain the different methods of integrating Sitecore and Salesforce along with the advantages and disadvantages of each.

Salesforce Web to Lead

The first option isn’t Sitecore specific at all and is functionality provided and supported by Salesforce. Web to Lead allows you to create a HTML form from within Salesforce that when submitted will create a Lead item in Salesforce. To add it to your Sitecore instance can be as simple as copy and pasting the generated HTML and setting up a confirmation page for Salesforce to redirect to once the form has been submitted.

The process for the form submission is then as follows:

With some extra technical input, as the HTML for the form is hosted on your site rather than through an iFrame, it is also possible to customise it with your own CSS and validation to fit in with the rest of your site.

Advantages

  • Very simple to set up
  • Very low in terms of complexity
  • Easy to maintain
  • No duplication of data in Sitecore (simpler for GDPR)
  • Potential for form to be CMS managed by pasting in HTML from Salesforce

Disadvantages

  • Vulnerable to spam
  • No data is fed back into Sitecore analytics
  • Limited to form submissions
  • Limited functionality

For more information on web to lead see: https://www.salesforce.com/products/guide/lead-gen/web-to-lead/

Salesforce Web to Lead plus some custom code

What’s good about web to lead is it provides a simple platform that can be built upon to fix some of its disadvantages.

With some custom code on the Sitecore side, we can change the process from the visitor submitting a form directly to Salesforce, to a form submitting to Sitecore which in turn does an HTTP Post to Salesforce. This gives us the ability to also add in some protection against spam using something like re-captcha and the ability to record some analytics information in Sitecore analytics.

The process for the form submission is as follows:

Advantages

  • Low complexity
  • Easy to maintain
  • No duplication of data in Sitecore (simpler for GDPR)
  • Protected against spam
  • Built on top of Salesforces web to lead functionality

Disadvantages

  • To make fully CMS-able requires a larger bespoke implementation
  • Limited to form submissions

Sitecore Connect for Salesforce CRM

If you’re after something greater than simple lead creation from a form on the site, then another option is the Sitecore Connect for Salesforce CRM module from Sitecore. This is built on top of Sitecore’s data exchange framework and works by syncing data between the two platforms.

Out of the box it is set up with some default mappings for contact information, but this can be expanded to included additional facets of information against a contact.

Using this type of integration as new contacts are created on the site, they will be created in salesforce too. Or a contact being in a salesforce campaign could be synced to the Sitecore instance to customise the users experience there.

For more information on Sitecore Connect for Salesforce CRM see: https://doc.sitecore.com/developers/salesforce-connect/20/sitecore-connect-for-salesforce-crm/en/sitecore-connect-for-salesforce-crm-configuration-guide.html

Advantages

  • Built and supported by Sitecore
  • Ability to build a more complex integration on a framework

Disadvantages

  • Much higher complexity
  • No quick wins for common scenarios such as creating a lead

Fuse IT S4S Connector

Since 2009 Fuse IT have been building a Sitecore to Salesforce connector targeting the biggest use cases for connecting the two platforms.

The connector is installed as packages into both Sitecore and Salesforce and provide out the box functionality such as:

  • Sitecore Forms and Web Forms for Marketers field mapping wizard to create leads in Salesforce. Which gives marketers complete freedom to add a form to a site that creates a lead in Salesforce
  • Sitecore analytics integration to sync behavioural data to salesforce leads and contacts
  • Sitecore personalisation for a contact from Salesforce

The package also includes sample code to enable expanding the solutions with bespoke development.

For more information on S4S see: https://www.fuseit.com/products/sitecore-salesforce-connector/

Advantages

  • Out the box functionality that takes advantages of Sitecore’s features
  • Ability to build a more complex integration on a framework

Disadvantages

  • Additional license fee
  • Creates a dependency that may affect upgrades in the future

Fully Bespoke

Often when you suggest a fully bespoke solution people can take an instant reaction to fear it will be costly, hard to maintain or will break. Writing traditional code often can get viewed as being bad or hard, but the reality is most alternative implementations still involve writing code but just have a nice interface at the start. Depending on the kind of integration your trying to achieve going fully bespoke might be the easiest solution. After all code is essentially an easy tool to achieve a complex scenario and it is generally the scenario which leads to the problem being hard, rather than a developer’s ability to write code in any given language.

A bespoke solution could be something as simple as a webservice being provided by a Salesforce developer that Sitecore will post to, much in the same way as web to lead, but capable of achieving some additional functionality on top.

Or it could be a middleware API layer that is used to connect multiple systems together. E.g. If your site needed to talk to Salesforce and a stock level system at the same time. Or it could be used to completely abstract the CRM system in order to facilitate Salesforce being swapped out with something else in the future.

Advantages

  • Anything’s possible

Disadvantages

  • Could be to complex for simple scenarios
  • Everything must be built
Two ways to import an XML file with .Net Core or .Net Framework

Two ways to import an XML file with .Net Core or .Net Framework

It's always the simple stuff you forget how you do. For years I've mainly been working with JSON files, so when faced with that task of reading an XML file my brain went "I can do that" followed by "actually how did I used to do that?".

So here's two different methods. They work on .Net Core and theoretically .Net Framework (my project is .Net Core and haven't checked that they do actually work on framework).

My examples are using an XML in the following format:

1<?xml version="1.0" encoding="utf-8"?>
2<jobs>
3 <job>
4 <company>Construction Co</company>
5 <sector>Construction</sector>
6 <salary>£50,000 - £60,000</salary>
7 <active>true</active>
8 <title>Recruitment Consultant - Construction Management</title>
9 </job>
10 <job>
11 <company>Medical Co</company>
12 <sector>Healthcare</sector>
13 <salary>£60,000 - £70,000</salary>
14 <active>false</active>
15 <title>Junior Doctor</title>
16 </job>
17</jobs>

Method 1: Reading an XML file as a dynamic object

The first method is to load the XML file into a dynamic object. This is cheating slightly by first using Json Convert to convert the XML document into a JSON string and then deserializing that into a dynamic object.

1using Newtonsoft.Json;
2using System;
3using System.Collections.Generic;
4using System.Dynamic;
5using System.IO;
6using System.Text;
7using System.Xml;
8using System.Xml.Linq;
9
10namespace XMLExportExample
11{
12 class Program
13 {
14 static void Main(string[] args)
15 {
16 string jobsxml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><jobs> <job><company>Construction Co</company><sector>Construction</sector><salary>£50,000 - £60,000</salary><active>true</active><title>Recruitment Consultant - Construction Management</title></job><job><company>Medical Co</company><sector>Healthcare</sector><salary>£60,000 - £70,000</salary><active>false</active><title>Junior Doctor</title></job></jobs>";
17
18 byte[] byteArray = Encoding.UTF8.GetBytes(jobsxml);
19 MemoryStream stream = new MemoryStream(byteArray);
20 XDocument xdoc = XDocument.Load(stream);
21
22 string jsonText = JsonConvert.SerializeXNode(xdoc);
23 dynamic dyn = JsonConvert.DeserializeObject<ExpandoObject>(jsonText);
24
25 foreach (dynamic job in dyn.jobs.job)
26 {
27 string company;
28 if (IsPropertyExist(job, "company"))
29 company = job.company;
30
31 string sector;
32 if (IsPropertyExist(job, "sector"))
33 company = job.sector;
34
35 string salary;
36 if (IsPropertyExist(job, "salary"))
37 company = job.salary;
38
39 string active;
40 if (IsPropertyExist(job, "active"))
41 company = job.active;
42
43 string title;
44 if (IsPropertyExist(job, "title"))
45 company = job.title;
46
47 // A property that doesn't exist
48 string foop;
49 if (IsPropertyExist(job, "foop"))
50 foop = job.foop;
51 }
52
53 Console.ReadLine();
54 }
55
56 public static bool IsPropertyExist(dynamic settings, string name)
57 {
58 if (settings is ExpandoObject)
59 return ((IDictionary<string, object>)settings).ContainsKey(name);
60
61 return settings.GetType().GetProperty(name) != null;
62 }
63 }
64}
65

A foreach loop then goes through each of the jobs, and a helper function IsPropertyExist checks for the existence of a value before trying to read it.

Method 2: Deserializing with XmlSerializer

My second approach is to turn the XML file into classes and then deserialize the XML file into it.

This approch requires more code, but most of it can be auto generated by visual studio for us, and we end up with strongly typed objects.

Creating the XML classes from XML

To create the classes for the XML structure:

1. Create a new class file and remove the class that gets created. i.e. Your just left with this

1using System;
2using System.Collections.Generic;
3using System.Text;
4
5namespace XMLExportExample
6{
7
8}

2. Copy the content of the XML file to your clipboard
3. Select the position in the file you want to the classes to go and then go to Edit > Paste Special > Paste XML as Classes

If your using my XML you will now have a class file that looks like this:

1using System;
2using System.Collections.Generic;
3using System.Text;
4
5namespace XMLExportExample
6{
7
8 // NOTE: Generated code may require at least .NET Framework 4.5 or .NET Core/Standard 2.0.
9 /// <remarks/>
10 [System.SerializableAttribute()]
11 [System.ComponentModel.DesignerCategoryAttribute("code")]
12 [System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
13 [System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
14 public partial class jobs
15 {
16
17 private jobsJob[] jobField;
18
19 /// <remarks/>
20 [System.Xml.Serialization.XmlElementAttribute("job")]
21 public jobsJob[] job
22 {
23 get
24 {
25 return this.jobField;
26 }
27 set
28 {
29 this.jobField = value;
30 }
31 }
32 }
33
34 /// <remarks/>
35 [System.SerializableAttribute()]
36 [System.ComponentModel.DesignerCategoryAttribute("code")]
37 [System.Xml.Serialization.XmlTypeAttribute(AnonymousType = true)]
38 public partial class jobsJob
39 {
40
41 private string companyField;
42
43 private string sectorField;
44
45 private string salaryField;
46
47 private bool activeField;
48
49 private string titleField;
50
51 /// <remarks/>
52 public string company
53 {
54 get
55 {
56 return this.companyField;
57 }
58 set
59 {
60 this.companyField = value;
61 }
62 }
63
64 /// <remarks/>
65 public string sector
66 {
67 get
68 {
69 return this.sectorField;
70 }
71 set
72 {
73 this.sectorField = value;
74 }
75 }
76
77 /// <remarks/>
78 public string salary
79 {
80 get
81 {
82 return this.salaryField;
83 }
84 set
85 {
86 this.salaryField = value;
87 }
88 }
89
90 /// <remarks/>
91 public bool active
92 {
93 get
94 {
95 return this.activeField;
96 }
97 set
98 {
99 this.activeField = value;
100 }
101 }
102
103 /// <remarks/>
104 public string title
105 {
106 get
107 {
108 return this.titleField;
109 }
110 set
111 {
112 this.titleField = value;
113 }
114 }
115 }
116
117}

Notice that the active field was even picked up as being a bool.

Doing the Deserialization

To do the deserialization, first create an instance of XmlSerializer for the type of the object we want to deserialize too. In my case this is jobs.

1 var s = new System.Xml.Serialization.XmlSerializer(typeof(jobs));

Then call Deserialize passing in a XML Reader. I'm creating and XML reader on the stream I used in the dynamic example.

1 jobs o = (jobs)s.Deserialize(XmlReader.Create(stream));

The complete file now looks like this:

1using System;
2using System.IO;
3using System.Text;
4using System.Xml;
5
6namespace XMLExportExample
7{
8 class Program
9 {
10 static void Main(string[] args)
11 {
12 string jobsxml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><jobs> <job><company>Construction Co</company><sector>Construction</sector><salary>£50,000 - £60,000</salary><active>true</active><title>Recruitment Consultant - Construction Management</title></job><job><company>Medical Co</company><sector>Healthcare</sector><salary>£60,000 - £70,000</salary><active>false</active><title>Junior Doctor</title></job></jobs>";
13
14 byte[] byteArray = Encoding.UTF8.GetBytes(jobsxml);
15 MemoryStream stream = new MemoryStream(byteArray);
16
17 var s = new System.Xml.Serialization.XmlSerializer(typeof(jobs));
18 jobs o = (jobs)s.Deserialize(XmlReader.Create(stream));
19
20 Console.ReadLine();
21 }
22 }
23}

And thats it. Any missing nodes in your XML will just be blank rather than causing an error.