[debian-edu-commits] [Debian Wiki] Update of "DebianEdu/Documentation/en/ITIL/Delivery" by PetterReinholdtsen
Debian Wiki
debian-www at lists.debian.org
Thu Oct 1 09:26:27 UTC 2015
The "DebianEdu/Documentation/en/ITIL/Delivery" page has been changed by PetterReinholdtsen:
https://wiki.debian.org/DebianEdu/Documentation/en/ITIL/Delivery?action=diff&rev1=15&rev2=16
Comment:
Generated from git.
The objective is to have control over the service level and improve the quality of the operational services. By repeating rounds the quality level is determined, monitored and reported. The purpose is to improve the contact between ICT administrators and users, so that an ICT service of the agreed quality is delivered.
- It is important to have a determined opinion about different types of SLAs. One can choose from many types of agreements. Typically three types :
+ It is important to understand the different types of SLAs. One can choose from many types of agreements. The three most common types are:
* Agreement per service for all customers
* Agreement per customer for all services
@@ -28, +28 @@
=== General checklist ===
* The agreement between the users and operations on what is actually being measured. This must be seen from the users' perspective and not the ICT service's perspective.
- * Measurement for and unambiguousity about the metrics included in the SLA
+ * Measurement and clarity about the metrics included in the SLA
* Decide realistic targets for the service level (there is no point in promising more than one can keep)
* Continuous focus on the control of the service - monitoring and periodic reporting of results achieved
@@ -68, +68 @@
Access to the services. It is best measured as the time one or more services have been unavailable during a given period, for example a calendar month. Different levels for different services may be agreed, for example depending on the degree of importance for users.
- Important to emphasise that this is availability within the agreed period of service, not the overall availability all day, all week and all year round (called 24/7/365). For example, it may be agreed that the system should be available between the hours. 8 to 18 on workdays, after that and on weekends it is more uncertain whether one can use the computer system, unless otherwise agreed.
+ It is important to emphasise that this is availability within the agreed period of service, not the overall availability all day, all week and all year round (called 24/7/365). For example, it may be agreed that the system should be available between 8:00 and 18:00 on workdays; after that and on weekends it is less certain whether one can use the computer system, unless otherwise agreed.
- Availability is also getting support via phone or email. For example, if the Service Desk can be reached between 08 and 16 during the day time, or if it can be reached the whole day, or in the afternoons and evenings, or even during specific weekends.
+ Availability also means getting support via phone or email. For example, whether the Service Desk can be reached between 08 and 16 during the day time, or if it can be reached the whole day, or in the afternoons and evenings, or even during specific weekends.
==== Stability ====
- Is often measured according to the amount of downtimes in a period of time, or the average time between downtimes. One can also measure the time it takes the system to come up again after downtimes.
+ Is often measured according to the amount of downtime in a period of time, or the average time between downtimes. One can also measure the time it takes the system to come up again after downtime.
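The stability measures named above can be sketched in a few lines of Python. This is only an illustration: the function name, the outage list and the choice to define "average time between downtimes" as uptime divided by the number of outages are assumptions, not something the page prescribes.

```python
from datetime import datetime, timedelta

def stability_metrics(outages, period_start, period_end):
    """Summarise stability for one reporting period.

    outages is a list of (down_at, up_at) datetime pairs (hypothetical data).
    """
    total_down = sum((up - down for down, up in outages), timedelta())
    count = len(outages)
    period = period_end - period_start
    # Average time it takes the system to come up again after downtime.
    mean_recovery = total_down / count if count else timedelta()
    # Average uptime between downtimes (assumed definition: uptime / outages).
    mean_between = (period - total_down) / count if count else period
    return {
        "outages": count,
        "total_downtime": total_down,
        "mean_time_to_recover": mean_recovery,
        "mean_time_between_downtimes": mean_between,
    }

# Two hypothetical outages in a 30-day period.
example = stability_metrics(
    [
        (datetime(2015, 10, 5, 9, 0), datetime(2015, 10, 5, 10, 0)),
        (datetime(2015, 10, 20, 12, 0), datetime(2015, 10, 20, 15, 0)),
    ],
    datetime(2015, 10, 1),
    datetime(2015, 10, 31),
)
```

Which of these figures goes into the SLA is a matter of agreement; the sketch only shows that all three come from the same outage log.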
==== Support ====
- Often measured as response times by phone (for example 1 minute) or email (for example 30 minutes) at requests from users. When the operator gets a request for support, the message will be categorized by severity with a time guarantee for answers. There may also be an agreement about how fast error correction will start, which will depend on what kind of received inquiry.
+ Often measured as response times by phone (for example 1 minute) or email (for example 30 minutes) to requests from users. When the operator gets a request for support, the message will be categorized by severity with a time guarantee for answers. There may also be an agreement about how quickly error correction will start, which will depend on what kind of inquiry was received.
- The support is also about when during the day or night one reach people. Should support be available during school hours between 08 and 16 o'clock, or should one also have support throughout the evening or on weekends. Some will have support also on certain holidays.
+ Support is also about when during the day or night one can reach people. Should support be available during school hours between 08 and 16 o'clock, or should one also have support throughout the evening or on weekends? Some will also have support on certain holidays.
- The period when support is available is usually in the SLA. It is also agreed what support will assist with to a fixed price, and what must be resolved additionally on an assignment basis. The agreement regulates the process of handling inquiries, both what to fix, and when this will happen.
+ The period when support is available is usually in the SLA. It is also agreed what support will be available, with a fixed price, and what must be resolved additionally on an assignment basis. The agreement regulates the process of handling enquiries, both what to fix, and when this will happen.
==== Capacity ====
- Can be measured as the average response time by certain operations in specific applications. Will measure the user experience of the system.
+ Can be measured as the average response time for certain operations in specific applications. This measures the users' experience of the system.
==== Change management ====
@@ -102, +102 @@
==== Reporting and follow-up ====
- Description of rules and periods for reporting of measured service levels. It is recommended regular meetings, for example quarterly, to go through the report and plan ahead.
+ Description of rules and periods for reporting of measured service levels. Regular meetings are recommended, for example quarterly, to go through the report and plan ahead.
==== Sanctions and possible incentives ====
@@ -112, +112 @@
== Financial Management ==
- Organisations rarely have a full overview of their ICT spending. A 2001-survey of Norwegian municipalities showed that only 1 of 8 municipalities had an ICT budget. Probably it is not better for school. Putting in place an ICT budget is important. Often users think they pay too much for a service they are not happy with. This creates many times conflicts between users and the ICT department.
+ Organisations rarely have a full overview of their ICT spending. A 2001 survey of Norwegian municipalities showed that only 1 in 8 municipalities had an ICT budget. It is probably not better for schools. Putting in place an ICT budget is important. Often users think they pay too much for a service they are not happy with. This often creates conflicts between users and the ICT department.
- It is very useful for both the operations center and users to document the real ICT costs. Without it is difficult to budget appropriately. Not least, it is difficult to make a cost/benefit assessment of existing ICT solutions. Rector should know the ICT budget as well as she knows salary budget, or the budget of teaching aids.
+ It is very useful for both the operations center and the users to document the real ICT costs. Without this, it is difficult to budget appropriately. Not least, it is difficult to make a cost/benefit assessment of existing ICT solutions. The rector should know the ICT budget as well as she knows the salary budget, or the budget for teaching aids.
There are three major key processes related to financial management of ICT services:
@@ -124, +124 @@
=== Budgeting ===
- The objective of the budget is to make a realistic estimate of the expected ICT costs. Budgeting usually contains various alternative solutions. It applies both to equipment and software, and the level you want to lay on. The budget is the starting point for subsequent budget negotiations with the director of education and/or politicians.
+ The objective of the budget is to make a realistic estimate of the expected ICT costs. Budgeting usually contains various alternative solutions. It applies both to equipment and software, and the level you aspire to. The budget is the starting point for subsequent budget negotiations with the director of education and/or politicians.
- Budget must include both personnel and equipment costs. Some organisations count only on costs to buy equipment, omitting as much as 60 - 70 % personnel costs for the operation of an ICT-solution. One must also get all of the equipment.
+ The budget must include both personnel and equipment costs. Some organisations only count the cost to buy equipment, omitting as much as 60 - 70 % in personnel costs for the operation of an ICT-solution. One must also count all of the equipment.
- There are examples of municipalities forgetting to count the cost of power connectors and computer networks in schools. Then you have forgotten about 2000 NOK (10 NOK = 0.85 GBP/1.18 EUR) per client machine. Should we put in place 70 new computers we talk soon about 140,000 NOK to computer networks and power.
+ There are examples of municipalities forgetting to count the cost of power connectors and computer networks in schools. Then you have forgotten about 2000 NOK (10 NOK = 0.85 GBP/1.18 EUR) per client machine. For 70 new computers, we need about 140,000 NOK for computer networks and power.
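The arithmetic behind this example can be written down as a small Python sketch. The per-client figure is the one quoted in the text; the constant and function names are made up for illustration.

```python
# Cost of power connectors and network per client machine, as quoted in the text.
NETWORK_AND_POWER_PER_CLIENT_NOK = 2000

def infrastructure_cost(clients, per_client=NETWORK_AND_POWER_PER_CLIENT_NOK):
    """Return the network and power cost in NOK for a number of client machines."""
    if clients < 0:
        raise ValueError("number of clients cannot be negative")
    return clients * per_client

print(infrastructure_cost(70))  # 140000
```

Putting such a line item in the budget as a function of the number of clients makes it harder to forget when the number of machines changes.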
- Alternative solutions are also important to include in the budget. This applies both for the operation and the equipment. Today there are several vendors who specialize in the operation of computer equipment in schools with varying prices and quality. Number of simultaneous users and type machines and software to be maintained, also means a lot.
+ Alternative solutions are also important to include in the budget. This applies both to the operation and the equipment. Today there are several vendors who specialize in the operation of computer equipment in schools with varying prices and quality. The number of simultaneous users, and the type of machines and software to be maintained, are also important.
If one would like to have laptops for all teachers and students one will easily get 5-6 times higher costs than if one had desktops with three students for each client machine.
=== Accounting ===
- The accounts will mainly consist of invoices for purchased equipment, cabling, repair, operation and extra services. When the accounting period is over, it is important to go through the numbers and compare this with the budget.
+ Accounting will mainly consist of invoices for purchased equipment, cabling, repair, operations and extra services. When the accounting period is over, it is important to go through the numbers and compare this with the budget.
=== Planning the accounting and billing ===
- Not all municipalities have accounting that shows ICT costs broken down by each school. There may be practical reasons for this, such as discounts and the like that the municipality gets centrally. Therefore it is important to do some planning so that you get an overview of what costs have been for operating and procurement when the accounts should be assessed against the budget.
+ Not all municipalities have accounting systems that show ICT costs detailed by school. There may be practical reasons for that, such as discounts and similar that the municipality gets centrally. Therefore it is important to do some planning so that you get an overview of what the costs were for operations and procurement when the accounting is assessed against the budget.
- Some organisations may have cumbersome and costly accounting procedures. You get fast extra charge to pay bills by delays, or if you have many who shall approve a payment. It is important to agree on good billing practices in the procurement and the operation. For both having control, as well as handling payments on time without long decision paths.
+ Some organisations may have cumbersome and costly accounting procedures. You quickly get extra charges if you pay bills late, or there are many who must approve a payment, for instance. So it is important to agree on good billing practices in procurement and operations in order to have control, as well as to handle payments on time without long decision processes.
=== Implementation ===
- The payment method is regulated by the SLA. One must agree with the finance department for a convenient way to get reports from the accounting, to get the necessary accounting overview of ICT costs without it takes a long time to get out the overview.
+ The payment method is regulated by the SLA. When it gets to the accounting system, one must agree with the finance department for a convenient way to get out the reports, in order to get the necessary accounting overview of ICT costs without it taking a long time to be generated.
=== Daily operation ===
- Regarding contracts one will usually have a fixed monthly billing consisting of a fixed amount and possible additional services. Billing is done from accounting office based on the current operating agreements, and the extra services performed. It is important to have good and frequent contact with the accounting service based on the tasks carried out for the customer.
+ Regarding contracts one will usually have a fixed monthly billing consisting of a fixed amount and possible additional services. Billing is done from the accounting office based on the current operations' contracts, and the extra services performed. It is important to have good and frequent contact with the accounting service based on the tasks carried out for the customer.
== Capacity Management ==
- Capacity planning is used to ensure that all parts of the ICT solution has sufficient capacity to safeguard users' requirements. This includes:
+ Capacity planning is used to ensure that all parts of the ICT solution have sufficient capacity to safeguard users' requirements. This includes:
- * Monitoring the performance of ICT services including infrastructure
+ * Monitoring the performance of ICT services and their related infrastructure
* Configuration of the systems to ensure they are optimally utilised to what the users actually do
- * Understanding user needs and planning for possible changes in the systems to take care of future needs
+ * Understanding the user needs and planning for possible changes in the systems to take care of future needs
* Resource planning in cooperation with the budget officer
- * Preparation of a capacity plan to ensure delivery of operations in accordance with the agreed service level
+ * Preparation of a capacity plan to ensure delivery of operations in accordance with the agreed upon service level
Capacity planning is all about balance:
- * Costs against capacity. The budget limits what kind of possible solutions to implement
+ * Costs against capacity. The budget limits what kind of possible solutions one can implement
- * Supply and demand. The systems must have capacity to handle the demands set of the users
+ * Supply and demand. The systems must have the capacity to handle the demands set by the users
The objective of capacity planning is to avoid surprises.
@@ -191, +191 @@
On the basis of data collected from monitoring routines, one tries to identify any bottlenecks in the systems. Examples:
- * Poor or varying utilization of hardware
+ * Poor or varying utilization of the hardware
* Poorly designed software
* Poor utilization of memory capacity
* Bottlenecks on data storage, memory or processor
@@ -205, +205 @@
||'''Bottlenecks'''||'''Actions'''||
||Missing sound, USB stick support and DVD on thin clients.||Install diskless workstations (> 800 MHz processor, > 256 MB RAM)||
- ||Has 60 thin clients connected to the server and wants more PCs.||Go for diskless clients, or install another a thin client server||
+ ||Has 60 thin clients connected to the server and wants more PCs.||Go for diskless clients, or install another thin client server||
||Thin clients run slowly after we expanded with 20 pieces without acquiring a new server machine||Install 2GB more memory on the server machine||
- ||Thin clients with 32MB memory does not start after upgrading to Skolelinux 2.0||Turn on cache (swap) of the thin clients, or downgrade to LTSP 4.2 which is set up with swap.||
+ ||Thin clients with 32MB memory do not start after upgrading to Skolelinux 2.0||Turn on swapping on the thin clients, or downgrade to LTSP 4.2 which is set up with swap.||
||Flash animations make the thin clients slow when 50 students are logged into the same server machine||Install diskless clients||
=== Implementation ===
- Implementation of possible changes the system configuration must be done in accordance with the guidelines set for changes of the system. A well-planned test of function and performance must also be done before changes can be made in the production system. Testing is done to avoid operational disturbances when changes are set into production.
+ Implementation of possible changes to the system configuration must be done in accordance with the guidelines set for changes of the system. A well-planned function and performance test must also be done before changes can be made in the production system. Testing is done to avoid operational disturbances when changes are set into production.
- === Making of the capacity plan ===
+ === Preparing the capacity plan ===
A capacity plan is basically an investment plan for the ICT system based on knowledge of the users' current needs and future plans.
@@ -239, +239 @@
* Availability of technical components
* Failure tolerance
* Quality of maintenance and support
- * Procedures and routines for handling operational services
+ * Procedures and routines for processing operational services
* Security, integrity and availability of data
Availability can be measured in several ways. Before showing examples, we point out why target figures can be difficult. To work systematically on availability, we have to clarify what the different terms mean. What, for example, does a percentage of availability mean?
- Let's say a "computer with computer program" is a service. If the computer program does not work one day, then the service unavailable if all the other programs work fine. What if the computer program is unavailable for a classroom, but available on the rest of the school (because of an underlying service). This is difficult matter to clarify and work on in practice.
+ Let's say a "computer with computer program" is a service. If the computer program does not work one day, is the service then unavailable even though all the other programs work fine? What if the computer program is unavailable for a classroom, but available for the rest of the school (because of an underlying service)? This is a difficult matter to clarify and work on in practice.
- === Measures for availability ===
+ === Availability measurements ===
Availability can be measured using several methods. Here are some examples:
||'''Value'''||'''Meaning'''||
- ||% available||The value can be availability between hours 08:00 and 18:00. If the system is down 1 hours during one day, than the system is available in 90% of the agreed upon time. If availability is measured over a month with 20 work days, then the system is available 95% of the time.||
+ ||% available||The value can be availability between hours 08:00 and 18:00. If the system is down 1 hour during one day, then the system is available in 90% of the agreed upon time. If availability is measured over a month with 20 work days (200 agreed hours), then the system is available 99.5% of the time.||
||% unavailable||If the system is down one hour during an agreed uptime, for example 10 hours a day, the system is unavailable in 10% of the time. Measured over 20 days (200 agreed hours), the system has been unavailable for 0.5% of the time.||
- ||Hours unavailable||One can agree to the number of times one accepts the system is unavailable during, for example within one month (20 days). It can be a maximum of one hour halt in the period, and between 08:00 until 18:00.||
+ ||Downtime||One can agree on how much unavailability one accepts during, for example, one month (20 days). It can be a maximum of one hour of unavailability in that period, counted between 08:00 and 18:00.||
- ||Error frequency||Even error rate can be measured per day or every month. 3 errors in the month and that the system is down between 08:00 until 18:00, is an example.||
+ ||Error frequency||Error frequency can also be measured per day or per month. An example is 3 errors in a month that take the system down between 08:00 and 18:00.||
- ||Error consequences||Measured values are a common starting point for judging whether an error to have consequences beyond ordinary error correction. The customer or the school for example, may ask to pay less for the operating agreement for the current month.||
+ ||Error consequences||Measured values are a common starting point for judging how to respond to an error beyond ordinary error correction. The customer or the school for example, may ask to pay less for the operating agreement for the current month.||
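The percentage rows in the table can be checked with a short Python sketch. It assumes the agreed service window is 08:00 to 18:00 (10 hours) on 20 work days, as in the table; the function and constant names are illustrative only. Note that one hour of downtime out of 200 agreed hours works out to 99.5 % available and 0.5 % unavailable.

```python
def availability_percent(agreed_hours, downtime_hours):
    """Percentage of the agreed service time in which the system was available."""
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    if not 0 <= downtime_hours <= agreed_hours:
        raise ValueError("downtime must lie within the agreed hours")
    return 100.0 * (agreed_hours - downtime_hours) / agreed_hours

HOURS_PER_DAY = 10   # agreed window 08:00 to 18:00
WORK_DAYS = 20       # one month of work days

one_day = availability_percent(HOURS_PER_DAY, 1)                # 90.0
one_month = availability_percent(HOURS_PER_DAY * WORK_DAYS, 1)  # 99.5
unavailable_month = 100.0 - one_month                           # 0.5
```

The same outage gives very different figures depending on the measurement period, which is why the period should be stated in the agreement next to the target.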
The most important thing is that your measurements describe the user experience in the best possible way. Therefore, one should measure what is important for the user.
- The feedback from schools is that printers gives most problems. This includes everything from the print queue has stalled, to missing paper or toner. Some have also experienced some instability with the browser, and that !OpenOffice.org suite is hanging. It may happen when your broadband connection is unstable and you have links in documents going to the Internet.
+ The feedback from the schools is that printers give most problems. This includes everything from a stalled print queue to missing paper or toner. Some have also experienced some instability with the browser, and that the !OpenOffice.org suite hangs. It may happen when your broadband connection is unstable and you have links in documents going to the Internet.
=== Infrastructure ===
- To have a stable computer system is dependent on a good enough technical quality of the network. Several schools have experienced instability because the physical computer network is provisional and of poor quality.
+ To have a stable computing environment, one is dependent on a good enough technical quality of the network. Several schools have experienced instability because the physical computer network is provisional and of poor quality.
Today many invest in wireless networks. In doing so, one must be aware that wireless networks have significant weaknesses. Wireless networks have limited capacity. Playback can be quite choppy when about 30 students watch a film from the Internet simultaneously. Wireless networks also have shadows, meaning areas without coverage, so some users end up in blind zones with a poor network connection or none at all.
- Should you set requirements to access, it is normally done by the operating company and ICT services to require good quality computer network at school.
+ Availability requirements for the maintenance company and ICT service providers should specify good quality of network services to schools.
=== «Single points of failure» ===
- Usually parts of a data solution must just work. Fails, for example, a firewall and stop working, stops all traffic to the Internet. One may also have problems with the stability of the system for allocating network addresses using DHCP (Dynamic Host Configuration Protocol).
+ Some parts of the system simply must work. Failures in a firewall, for example, may compromise security or (if you're lucky) shut down the whole network. This last can also result from problems with the DHCP (Dynamic Host Configuration Protocol) system for sharing out addresses.
- The operating department's responsibility is to know of the parts that may stop the entire data solution. It is important to find these points, and remove the errors one by one, if this is something you can afford. If one can't afford to remove sources of errors stopping for example. the entire computer network, one must live with the risk for something suddenly does not.
+ The operating department has a responsibility to know which parts may stop the entire system. It is important to find these points, and remove the errors one by one, to the extent you can afford. If one can't afford to remove these sources of errors, one must live with the risk of the entire computer network suddenly grinding to a halt.
- Sources of error making everything stop, may also be logical rather than physical. This is especially true for computer networks and databases. So it is important to have a broader perspective when it comes to such errors.
+ Sources of errors making everything stop, may also be logical rather than physical. This is especially true for computer networks and databases. So it is important to have a broader perspective when it comes to such errors.
=== Risk management ===
- One must consider what one accepts of risks in the network. Is it acceptable that users lose personal files and data, when a hard drive fails? How quickly should one replace broken equipment? Some schools have experienced it takes several days to get the server up and go after a virus attack. The municipality has not resources to allocate to fix errors.
+ One must consider what risks one accepts in the network. Is it acceptable that users lose personal files and data, when a hard drive fails? How quickly should one replace broken equipment? Some schools have spent several days getting a server up and running again after a virus attack. The municipality may have no resources to allocate to fix errors.
- Much of operation goes on to maintain the agreed service level. It's about avoiding and lose confidence and user satisfaction. Risk management is about having in place the appropriate resources to keep the entire computer system on the air, and have resources ready if something should go wrong, and needs to be fixed.
+ Much of the operations work goes into maintaining the agreed service level. It's about avoiding the loss of confidence and user satisfaction. Risk management is about having in place the appropriate resources to keep the entire computer system up and running, and having resources ready if something should go wrong and need to be fixed.
=== Testing ===
- It is a big difference to install equipment and software on a single PC and hundreds, even thousands of computers. With responsibility for hundreds of machines a small error, one can live with on a PC, mean much instability and discontent if the error affects hundreds of users.
+ There is a big difference between installing equipment and software on a single PC and doing it on hundreds, even thousands of computers. With responsibility for hundreds of machines, a small error that one can live with on a single PC means much instability and discontent if it affects hundreds of users.
- To avoid making mistakes during installation and contributes to stability, it is essential to test equipment and software to be used. It's is to follow up the expected quality. If you want a stable operation one must often choose next to the last edition of equipment and software.
+ To avoid making mistakes during installation and to contribute to stability, it is essential to test the equipment and software to be used. It's about following up the expected quality. If one wants stable operations, one must often choose the next-to-latest edition of equipment and software.
- One should avoid adopting software ending with a zero. For example you should avoid !OpenOffice.org 4.0. One should adopt the office program when version 4.0.2 has arrived or later. Then the program has been fixed for several errors. The same applies to hardware.
+ One should avoid adopting software with a version number ending with a zero. For example you should avoid !OpenOffice.org 4.0. One should adopt the office program when version 4.0.2 has arrived or later. Then the program has been fixed for several errors. The same applies to hardware.
- Server machines have usually a slightly older version of processors, and more robust memory, and hard drives. This is because many people use this hardware simultaneously. A small error that would not mean anything for one user, can provide downtime if 30 users logged into the machine.
+ Server machines usually have a slightly older version of processors, and more robust memory and hard drives. This is because many people use this hardware simultaneously. A small error that would not mean anything for one user can cause downtime if 30 users are logged into the machine.
So testing is about using proven equipment and editions of software that have been running well for half a year or a year. Testing is also about trying out the different parts in a smaller but realistic context, to ensure that everything works. Adopting the latest version, or even beta versions, of software or the very latest hardware usually leads to much trouble and extra maintenance work. Putting systems into production without a small test in a realistic environment usually leads to significant firefighting and dissatisfied users.
- When testing in a smaller scale on equipment in production, it is essential to arrange this with those affected. In addition, one must choose when to test. One should not test new things, for example, under examinations with use of ICT tools.
+ When testing on a smaller scale on equipment in production, it is essential to coordinate that with those affected. In addition, one must choose when to test. One should not test new things, for example, during exams where ICT tools are used.
=== Design improvements ===
- An operations department will be served by correcting systems that provide much operational messages. It may be users getting much spam. Then it may be OK to install files for spam. There may be a lot of extra work with students who constantly forgets their password, and teachers who send the inquiry to the central sysadmin staff. To avoid extra emailing and double work so the teacher can give the student a new password.
+ It is often worth an operations department's while to enhance systems that produce many operational messages. If users get much spam, then it might be wise to install spam-filters. There might be a lot of extra work with students who constantly forget their passwords, if teachers have to get central sysadmin staff to help them out. To avoid extra emailing and double work, the teacher can be authorised to give the student a new password.
- This was a some examples of design improvements to lighten the work of operation and allows users become more satisfied. A well-run operations department has a list of prioritized improvements in design making operation easier. The priorities is usually done based on an assessment of the inquiry to the service, stored in the message log, and an assessment of the work that must be done to treat the requests.
+ These are two examples of design improvements that simplified maintenance and made users happier. A well-run maintenance team has a prioritised list of such improvements. Prioritising these, as a rule, is based on how often relevant issues show up in the service office's message log and estimates of how much work each improvement shall involve.
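One way to turn the message log into a prioritised list is sketched below in Python. The ranking rule (requests per estimated hour of work), the issue names and the numbers are all assumptions made for illustration; the page does not prescribe a formula.

```python
from collections import Counter

def prioritise(message_log, effort_hours):
    """Rank candidate improvements by requests per estimated hour of work.

    message_log: one issue label per logged request (hypothetical labels).
    effort_hours: estimated hours of work for the improvement addressing each issue.
    """
    counts = Counter(message_log)
    return sorted(
        effort_hours,
        key=lambda issue: counts[issue] / effort_hours[issue],
        reverse=True,
    )

# Hypothetical log: 40 spam complaints, 25 forgotten passwords, 5 printer requests.
log = ["spam"] * 40 + ["forgotten password"] * 25 + ["printer"] * 5
effort = {"spam": 8, "forgotten password": 2, "printer": 10}

ranking = prioritise(log, effort)
# ranking == ["forgotten password", "spam", "printer"]
```

Here letting teachers reset passwords comes first because it removes many requests for little work, even though spam generates more messages in total.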
=== Planning for availability ===
- It means having realistic expectations to the ICT service based on what operations costs. Plan for what's expected accessibility. For example, when schools require one should be on air in less than one hour after the server crashes, one must have a standing pre-installed machine in reserve, to be inserted as replacement for the faulty machine. It's made during one hour to copy your backup files to the backup machine.
+ Planning for availability means having realistic expectations of the ICT service based on what operations cost. Plan for the expected availability. For example, if schools require services to be up and running again less than one hour after a server crashes, a pre-installed machine must stand in reserve, ready to replace the faulty machine. What remains to be done within that hour is to copy the backup files to the reserve machine.
- Is a diskless or thin client broken a prepared small warehouse of machines and monitors is needed at the school. The school ICT contact can retrieve and install a replacement machine. This can be done easily without waiting days in ordering equipment.
+ So that a failed diskless or thin client can be replaced quickly, the school should keep a small stock of machines and monitors ready. The school ICT contact can then fetch and install a replacement machine without waiting days for an equipment order to be filled.
=== Planning for recovery ===
- As the example of equipment standing ready to replace defective equipment, it is also expected to be able to retrieve lost files and data. Therefore it is crucial to have a backup of user data and a copy of the configuration files. One must also have computer architectural drawings, and descriptions of system, making ICT staff able to quickly install systems when something goes wrong.
+ Just as equipment should stand ready to replace defective equipment, users also expect to be able to retrieve lost files and data. It is therefore crucial to back up user data regularly and to keep a copy of the configuration files. One must also have architectural diagrams and system descriptions, so that ICT staff can quickly set systems up again when something goes wrong.
- It is crucial to schedule backup of user data and settings. One must plan ahead in order to have proper equipment and appropriate services. Routines must be planned to be followed when certain error situations occurs and systems must be restored.
+ It is crucial to schedule backups of user data and settings. One must plan ahead in order to have proper equipment and appropriate services. The routines to follow when certain error situations occur and systems must be restored should be planned in advance.
== Service Continuity ==
- Operating continuity or continuity management is often the most costly part of the work. High demands to operational continuity will require huge investments, which must be agreed in making the SLA. For example it can be agreed that there is no disaster plan for certain services. If you have a disaster plan the value is very low if not tested once in a while. Usually this is expensive. There are examples where customers and management have blocked the engine room and turned off power to test readiness of the IT department.
+ Operational continuity, or continuity management, is often the most costly part of the work. High demands for operational continuity require huge investments, which must be agreed upon when the SLA is made. For example, it can be agreed that there is no disaster plan for certain services. A disaster plan is of very little value if it is not tested once in a while, and testing is usually expensive. There are examples where customers and management have blocked access to the machine room and turned off the power to test the readiness of the IT department.
- Operating continuity may be appropriate in certain periods like under examinations. Then it may be extra requirements to have equipment with backup ready in case of a hard disk on the server fails. But even this will require considerable additional work for the operational staff.
+ Operational continuity may matter more in certain periods, such as during examinations. There may then be extra requirements to have backup equipment ready in case a hard disk on the server fails. But even this will require considerable additional work for the operational staff.
- An IT coordinator told us that it might be just as well to postpone the exam one day, if something went wrong with the computer system. This costs a lot less than having a double number of servers at each school. There are examples of schools having had water leakage. Then it is usual to defer examination a day or two to repair the damage . One might think the same way when it comes to school data solution. If you have a backup of home directories for pupils and teachers, you have time to consider without doubling systems at each school. Then it is sufficient with one or two servers in reserve located at the municipality building, which quickly can be moved and connected at the school if something goes wrong.
+ An IT coordinator told us that it might be just as well to postpone the exam by one day if something goes wrong with the computer system. This costs a lot less than having twice the number of servers at each school. There are examples of schools having had water leaks, and it is then usual to defer the examination by a day or two to repair the damage. One might think the same way about the school's computer systems. If you have a backup of the home directories of pupils and teachers, you have time to consider what to do, without duplicating the systems at each school. It is then sufficient to keep one or two servers in reserve at the municipality building, which can quickly be moved to the school and connected if something goes wrong.