How to Preview a Document or File in a Browser for SaaS
Previewing documents is something most people do every day. However, if you want to build such a feature yourself, googling 'How to Preview Document or File in a Browser' might not be enough to find a development solution for your project. We checked it ourselves and couldn't find a clear explanation of the process. In this article, we analyze how to create a web application that solves this problem for the different types of documents used in SaaS development.
Why do you need to view a doc file in the browser?
One of the main reasons is that this functionality reduces the time needed to search for and check documents. Users are often reluctant to download documents because downloads clutter their devices. Also, not everyone has the software needed to view documents in various formats.
The component we developed is used to preview both uploaded documents and documents generated from a template.
Why view a doc file in the browser for Fintech SaaS Solutions
Accounting and ERP systems often require files to be attached in order to comply with reporting requirements. These are situations where it is not possible to get rid of paper completely and switch to digital. In such cases, scans have to be stored, and a doc preview removes the long search for the right document.
Systems aimed at transforming paper documents into digital information, such as personal finance accounting systems, as a rule require scanning and recognition.
Why view a pdf file in a web browser for CRM, HR and Other Human Resource Systems
Information about a client or an employee is the main thing such systems store, and it is often necessary to attach a scan of a passport or a cooperation agreement. Here a file preview speeds up finding the necessary information.
Why display a word document in HTML for Legaltech Systems
At the moment, these are among the systems that lag furthest behind in technical development. Their work processes are quite conservative, so they still deal with a lot of paper. Most often, users come across documents in the system in two cases:
- for easy storage, accounting and printing at the right time;
- for document recognition and conversion into a digital version.
Legaltech systems need to store and manage all kinds of documents, including blank templates and separately filled-in ones.
Why preview a word document in the browser for the Healthcare Industry
A Hospital Management System typically handles a large number of documents and images (X-ray, ultrasound, patient photos, etc.). In addition to the variety of document types, these systems encounter unusual formats more often than most, and classical solutions cannot process them.
Is it secure to open a doc in the browser for preview?
Off-the-shelf solutions have a big security issue: you have to send files to third-party services. Our team therefore solves this problem on its own, so that the data stays on a private server.
How we ensure the safety of your data:
- File transfer occurs via HTTPS / SSL;
- Safe storage with Amazon S3, Google Cloud or Microsoft Azure.
For secure storage, our team prefers to work with Amazon S3 because of features such as:
- Low latency and high throughput;
- Object storage designed for 99.999999999% durability across multiple Availability Zones (AZs);
- Resilience to events affecting an entire Availability Zone;
- Designed for 99.99% availability over a given year;
- Availability backed by the Amazon S3 Service Level Agreement;
- SSL support for data in transit and encryption of data at rest;
- S3 lifecycle management to automatically migrate objects to other S3 storage classes.
Off-the-shelf solutions: pros and cons
There are free online doc previewers. Unfortunately, there are only two, and neither is perfect, but they greatly expand the possibilities. So here are our money savers: Google Docs and Office Web Apps.
Preview files with Google Docs Viewer
This is not an official solution: Google does not publish documentation on how to use it properly, but developers figured it out anyway. Although Google Docs Viewer is no longer supported, it still works!
Pros:
- Many supported file types. You will probably find every file type you would like to preview: images, videos, text, code, Microsoft Office file types, PDFs, Adobe file types, SVGs, font file types, archive file types and more;
- 25MB file limit;
- Works in every popular desktop and mobile browser, which is very important if you want to offer a preview on mobile devices.
Cons:
- Besides the lack of support from Google, it tends to throw random errors that result in no preview at all. What's more, there is no way of checking whether it failed: the inline embed gives you no information about it (no browser event or anything);
- Microsoft file types such as .ppt, .doc and .xls are not native Google file types, so the viewer has some problems displaying them. They will still show up, but in .doc files, for example, some images might jump to the next line or page instead of appearing in a row.
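For illustration, here is a minimal sketch of how such an embed link is usually built. Because Google does not document this endpoint, the URL pattern below is the widely circulated unofficial form and should be treated as an assumption; the function name googleViewerUrl is ours, and the 25MB limit from the list is interpreted as 25 * 1024 * 1024 bytes.

```typescript
// Unofficial, undocumented endpoint: the URL pattern is an assumption.
const GOOGLE_VIEWER_BASE = "https://docs.google.com/viewer";

// 25MB limit from the list above, interpreted as binary megabytes.
const GOOGLE_MAX_BYTES = 25 * 1024 * 1024;

// Builds the iframe src for a publicly reachable file URL.
// Returns null when the file is larger than the viewer is expected to accept.
function googleViewerUrl(fileUrl: string, sizeBytes: number): string | null {
  if (sizeBytes > GOOGLE_MAX_BYTES) {
    return null;
  }
  return `${GOOGLE_VIEWER_BASE}?url=${encodeURIComponent(fileUrl)}&embedded=true`;
}
```

The returned string would go into the src attribute of an iframe. Since the viewer reports no failure events, it seems sensible to keep a plain download link next to the iframe as a fallback.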
Preview files with Office Web Apps
Microsoft also offers its own solution for previewing files on your website. It is surely the best option for Office file types because it is the best at parsing them into HTML.
Pros:
- Faster loading than Google Docs;
- Always successfully displays the result - no random errors;
- Most accurate .doc and .ppt parser.
Cons:
- Supports only Microsoft Office file types: .ppt(x), .doc(x), .xls(x);
- 10MB limit for docs/ppts, 5MB for xls;
- Little or no support for mobile devices: it throws errors, doesn't display anything, and isn't responsive below ~700px width.
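As a rough sketch, the restrictions listed for Office Web Apps can be checked before the embed link is built. The embed URL pattern below is the commonly used one and is stated here as an assumption; the names officeKind and officeViewerUrl are hypothetical, and the size limits are interpreted as binary megabytes.

```typescript
type OfficeKind = "word" | "powerpoint" | "excel";

// Size limits quoted in the list above, interpreted as binary megabytes.
const OFFICE_LIMITS: Record<OfficeKind, number> = {
  word: 10 * 1024 * 1024,
  powerpoint: 10 * 1024 * 1024,
  excel: 5 * 1024 * 1024,
};

// Maps a file name to the Office family it belongs to, or null if unsupported.
function officeKind(fileName: string): OfficeKind | null {
  const ext = fileName.toLowerCase().split(".").pop() ?? "";
  if (ext === "doc" || ext === "docx") return "word";
  if (ext === "ppt" || ext === "pptx") return "powerpoint";
  if (ext === "xls" || ext === "xlsx") return "excel";
  return null;
}

// Builds the iframe src, or returns null when the file type is unsupported
// or the file exceeds the limit for its type.
function officeViewerUrl(fileUrl: string, fileName: string, sizeBytes: number): string | null {
  const kind = officeKind(fileName);
  if (kind === null || sizeBytes > OFFICE_LIMITS[kind]) {
    return null;
  }
  return `https://view.officeapps.live.com/op/embed.aspx?src=${encodeURIComponent(fileUrl)}`;
}
```

A null result is the signal to fall back to another viewer or to a download link.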
You may think that creating a custom document and file preview solution is more complicated and expensive. We can assure you that it is not as scary as it sounds, and it is much more secure, especially if you deal with personal information.
Our experience with a custom document previewer for the browser
The project we worked on required a component for previewing documents in the following formats: jpeg, png, tiff, pdf, xls, xlsx, doc, docx.
The component had to provide the following functionality:
- Page through the document and scroll;
- Zoom the document page in and out;
- Since the documents are confidential, they must not be processed on third-party resources.
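The paging and zoom requirements can be modelled as a small piece of viewer state. This is only a sketch: the ViewerState shape, the zoom bounds and the step size are hypothetical values chosen for illustration, not the ones used in the project.

```typescript
interface ViewerState {
  page: number;      // 1-based current page
  pageCount: number; // total number of pages in the document
  scale: number;     // 1 means 100%
}

// Hypothetical zoom bounds and step.
const MIN_SCALE = 0.5;
const MAX_SCALE = 3;
const SCALE_STEP = 0.25;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Moves to the requested page, staying inside the document.
function goToPage(state: ViewerState, page: number): ViewerState {
  return { ...state, page: clamp(page, 1, state.pageCount) };
}

// Zooms in (direction 1) or out (direction -1) by one step, within the bounds.
function zoom(state: ViewerState, direction: 1 | -1): ViewerState {
  const next = state.scale + direction * SCALE_STEP;
  return { ...state, scale: clamp(next, MIN_SCALE, MAX_SCALE) };
}
```

Keeping these transitions as pure functions makes them easy to plug into React state and to test without rendering anything.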
The diagram below shows two processes. The first is uploading documents and converting them. The second is opening documents for preview. Depending on the interface, they can happen one after another, but uploading and conversion happen only once, while the preview can be performed repeatedly.
In the first process, the user uploads the document to the web server, and it is saved on the file server as the original. Then the web server sends the document to Gotenberg, which converts it into PDF. As a result, we have two documents: the original and the converted one. If the original is not needed, it can be deleted.
Once the document has been saved as a PDF, the web server writes the information needed to link to the document's path into the database, depending on the task.
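The "convert once, preview many times" idea can be sketched with an in-memory registry standing in for the database. Everything here is hypothetical: the StoredDocument shape, the class name and the file paths are assumptions made for illustration.

```typescript
interface StoredDocument {
  id: string;
  originalPath: string | null; // null once the original has been deleted
  pdfPath: string;             // path of the converted PDF on the file server
}

// In-memory stand-in for the database table that binds an ID to file paths.
class DocumentRegistry {
  private readonly records = new Map<string, StoredDocument>();

  // Called once, after the upload has been stored and converted.
  register(id: string, originalPath: string, pdfPath: string): StoredDocument {
    const record: StoredDocument = { id, originalPath, pdfPath };
    this.records.set(id, record);
    return record;
  }

  // Forgets the original when only the converted PDF is needed.
  discardOriginal(id: string): void {
    const record = this.records.get(id);
    if (record !== undefined) {
      this.records.set(id, { ...record, originalPath: null });
    }
  }

  // A preview never triggers a conversion: it only looks up the stored PDF path.
  previewPath(id: string): string | null {
    return this.records.get(id)?.pdfPath ?? null;
  }
}
```

With this split, every preview request is a cheap lookup, and the conversion cost is paid only at upload time.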
Next, we had to develop a second process, in which a document template is filled in automatically. In this case, the user sends the web server an ID together with the required document. After receiving the template, the web server replaces the keys in it with user data from the database.
The final version goes to Gotenberg, where it is converted to PDF. After receiving the file, the web server sends it to the file server to be saved. As in the previous case, the web server records the path to the document in the database. Thus, on the next request from the user, a ready document is found for further use.
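The key replacement step can be illustrated on plain text. This is a simplified sketch, not the docx4j-based code used on the server: the ${key} placeholder syntax and the function name fillTemplate are assumptions, and a real docx template needs extra care because a placeholder can be split across several XML runs.

```typescript
// Replaces ${key} placeholders with values from the given record.
// Unknown keys are left untouched so that missing data is easy to spot.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}
```

For example, calling fillTemplate with the text "Agreement with ${clientName}" and a record containing clientName gives the same sentence with the client's name in place of the key.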
Now let’s review the development process in detail.
At first, we faced the issue that the browser does not support all formats. The browser can natively show only pictures and PDFs, but not the rest (spreadsheets, Word documents, TIFF). For any more exotic format, the system can be extended by adding a converter from that format to PDF.
For the implementation, we adhered to the following principles:
- To display a PDF document, the @mikecousins/react-pdf component is used;
- To display a picture in PNG or JPEG format, the jsPDF library is used, which creates a PDF file from it;
- To display a picture in TIFF format, the tiff library is used, which converts the image and passes the data to jsPDF;
- The print-js library is used to print PDF files.
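The routing described in this list can be sketched as a single function that picks a strategy from the file extension. The strategy names and the function previewStrategy are hypothetical; the sketch only decides which path a file takes and does not call the libraries themselves.

```typescript
type PreviewStrategy =
  | "render-pdf"         // show directly in the PDF component
  | "image-to-pdf"       // wrap a PNG/JPEG into a PDF on the client
  | "tiff-to-pdf"        // decode the TIFF first, then wrap it into a PDF
  | "convert-on-server"; // send to the server-side converter

// Picks the preview path for a file, or null for an unsupported format.
function previewStrategy(fileName: string): PreviewStrategy | null {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  switch (ext) {
    case "pdf":
      return "render-pdf";
    case "png":
    case "jpg":
    case "jpeg":
      return "image-to-pdf";
    case "tif":
    case "tiff":
      return "tiff-to-pdf";
    case "doc":
    case "docx":
    case "xls":
    case "xlsx":
      return "convert-on-server";
    default:
      return null;
  }
}
```

Because every branch ends in a PDF, the viewer component itself only ever has to render one format; supporting a new format means adding one more case and one more converter.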
We used the following libraries:
- jsPDF - creates PDF files;
- @mikecousins/react-pdf - React component that displays PDF files;
- tiff - converts a TIFF image to canvas;
- print-js - prints PDF files;
- file-saver - downloads files.
Many conversion libraries did not fit: they had problems with character encoding in different languages, and formatting was not fully supported (colors disappeared, text indents broke, italics were lost, etc.).
The solution was the open-source project Gotenberg, which works like a charm. It is based on the LibreOffice engine, so it does not have these problems. Gotenberg is a Docker-powered stateless API for converting HTML, Markdown, and Office documents to PDF.
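To give an idea of what a conversion call looks like, here is a sketch that prepares the request without sending it. It assumes Gotenberg 7 or later, where LibreOffice conversions are exposed at /forms/libreoffice/convert and the document travels in a multipart field named files; older versions used different routes, and the article's setup may differ. The ConversionRequest shape and the base URL in the usage note are hypothetical.

```typescript
interface ConversionRequest {
  method: "POST";
  url: string;
  fileField: string; // multipart form field that carries the document
  fileName: string;
}

// Route for LibreOffice conversions in Gotenberg 7 and later (assumed version).
const LIBREOFFICE_ROUTE = "/forms/libreoffice/convert";

// Describes the HTTP request that converts one Office document to PDF.
function buildConversionRequest(gotenbergBaseUrl: string, fileName: string): ConversionRequest {
  const base = gotenbergBaseUrl.replace(/\/+$/, ""); // avoid a double slash
  return {
    method: "POST",
    url: base + LIBREOFFICE_ROUTE,
    fileField: "files",
    fileName,
  };
}
```

With a base URL such as http://gotenberg:3000, the web server would send the file as multipart/form-data to the resulting URL and store the response body, which is the converted PDF, on the file server.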
Our team had two tasks: first, to display uploaded documents, and second, to display templates linked to a database from which each template is filled in automatically.
Documents are generated from a template in docx format, which is essentially an archive of XML files. To work with it we use the docx4j library, which opens such documents (whether created in OpenOffice or MS Office) so that you can work with them in Java: you can modify the document and then save it to the server.
The docx and xlsx file formats are zip archives containing XML text, graphics, and other data.
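Because docx and xlsx files are zip archives, an uploaded file can be given a cheap sanity check by looking at its first bytes. The sketch below tests for the standard ZIP local file header signature; the function name looksLikeZipContainer is ours, and the check says nothing about whether the archive really contains a valid document.

```typescript
// ZIP local file header signature: the bytes "PK" followed by 0x03 0x04.
const ZIP_SIGNATURE = [0x50, 0x4b, 0x03, 0x04];

// Returns true when the data starts with the ZIP signature.
function looksLikeZipContainer(bytes: Uint8Array): boolean {
  if (bytes.length < ZIP_SIGNATURE.length) {
    return false;
  }
  return ZIP_SIGNATURE.every((value, index) => bytes[index] === value);
}
```

Note that the legacy binary formats doc and xls are not zip archives, so this check applies only to docx and xlsx uploads.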
Summing up, the implementation used React on the client side and Java, Docker, and Gotenberg on the server side.
Final thoughts on how to preview a document in a web browser
Looked at in detail, implementing document previews in a browser is not a complicated process, provided that the security requirements are met and the right technologies are selected.
However, if you are not using a ready-made solution to open a word doc in the browser, it is better to get help from an experienced team that has already solved a similar problem. Our dedicated development team is ready to assist you with displaying word documents in the browser as well as file previews for your project. Please contact us for details.